Abstract

In this paper we give a self-contained introduction to the conceptual and
mathematical foundations of quantum information theory. In the first part we
introduce the basic notions like entanglement, channels, teleportation etc. and
their mathematical description. The second part is focused on a presentation of
the quantitative aspects of the theory. Topics discussed in this context include:
entanglement measures, channel capacities, relations between both, additivity and
continuity properties and asymptotic rates of quantum operations. Finally we give
an overview of some recent developments and open questions.
Chapter 1
Introduction 



Quantum information and quantum computation have recently attracted a lot
of interest. The promise of new technologies like secure cryptography and new
"super computers", capable of handling otherwise intractable problems, has excited
not only researchers from many different fields, such as physics, mathematics and
computer science, but also a large public audience. On a practical level all these
new visions are based on the ability to control the quantum states of (a small
number of) micro systems individually and to use them for information transmission
and processing. From a more fundamental point of view the crucial point is a
reconsideration of the foundations of quantum mechanics in an information
theoretical context. The purpose of this work is to follow the second path and to
guide physicists into the theoretical foundations of quantum information and some
of the most relevant topics of current research.

To this end the outline of this paper is as follows: The rest of this introduction
is devoted to a rough and informal overview of the field, discussing some of its tasks
and experimental realizations. Afterwards, in Chapter 2, we will consider the basic
formalism which is necessary to present more detailed results. Typical keywords
in this context are: systems, states, observables, correlations, entanglement and
quantum channels. We then clarify these concepts (in particular entanglement and
channels) with several examples in Chapter 3, and in Chapter 4 we discuss the most
important tasks of quantum information in greater detail. The last three chapters
are devoted to a more quantitative analysis, where we make closer contact with
current research: In Chapter 5 we will discuss how entanglement can be measured.
The topic of Chapter 6 is channel capacities, i.e. we are looking at the amount of
information which can maximally be transmitted over a noisy channel, and in
Chapter 7 we consider state estimation, optimal cloning and related tasks.

Quantum information is a rapidly developing field and the present work can of
course reflect only a small part of it. An incomplete list of other general sources the
reader should consult is: the books of Lo [?], Gruska [?], Nielsen and Chuang [122],
Bouwmeester et al. [?] and Alber et al. [?], the lecture notes of Preskill [130]
and the collection of references by Cabello [37], which in particular contains many
references to other reviews.

1.1 What is quantum information? 

Classical information is, roughly speaking, everything which can be transmitted 
from a sender to a receiver with "letters" from a "classical alphabet", e.g. the two
digits "0" and "1" or any other finite set of symbols. In the context of classical 
information theory, it is completely irrelevant which type of physical system is used 
to perform the transmission. This abstract approach is successful because it is easy 
to transform information between different types of carriers like electric currents in 
a wire, laser pulses in an optical fiber, or symbols on a piece of paper without loss 
of data; and even if there are losses they are well understood and it is known how 
to deal with them. However, quantum information theory breaks with this point of 
view. It studies, loosely speaking, that kind of information ("quantum information")
which is transmitted by micro particles from a preparation device (sender) to a 
measuring apparatus (receiver) in a quantum mechanical experiment - in other 
words the distinction between carriers of classical and quantum information becomes 
essential. This approach is justified by the observation that a lossless conversion of 
quantum information into classical information is in the above sense not possible. 






Therefore, quantum information is a new kind of information. 

In order to explain why there is no way from quantum to classical information
and back, let us discuss what such a conversion would look like. To convert quantum
to classical information we need a device which takes quantum systems as input
and produces classical information as output - this is nothing else than a measuring
apparatus. The converse translation from classical to quantum information can be
rephrased similarly as "parameter dependent preparation", i.e. the classical input to
such a device is used to control the state (and possibly the type of system) in which
the micro particles should be prepared. These two elements can be combined in
two ways. Let us first consider a device which goes from classical to quantum
to classical information. This is a possible task and in fact technically realized
already. A typical example is the transmission of classical information via an optical
fiber. The information transmitted through the fiber is carried by micro particles
(photons) and is therefore quantum information (in the sense of our preliminary
definition). To send classical information we first have to prepare photons in a
certain state, send them through the channel and measure an appropriate observable
at the output side. This is exactly the combination of a classical → quantum with
a quantum → classical device just described.

The crucial point is now that the converse composition - performing the
measurement M first and the preparation P afterwards (cf. Figure 1.1) - is more
problematic. Such a process is called classical teleportation if the particles
produced by P are "indistinguishable" from the input systems. We will show the
impossibility of such a device via a hierarchy of other "impossible machines" which
traces the problem back to the fundamental structure of quantum mechanics. This
finally will prove our statement that quantum information is a new kind of
information.¹



[Diagram: a measurement M followed by a preparation P.]

Figure 1.1: Schematic representation of classical teleportation. Here and in the
following diagrams a curly arrow stands for quantum systems and a straight one for
the flow of classical information.



To start with, we have to clarify the precise meaning of "indistinguishable" in
this context. This has to be done in a statistical way, because the only possibility to
compare quantum mechanical systems is in terms of statistical experiments. Hence
we need an additional preparation device P' and an additional measuring apparatus
M'. Indistinguishable now means that it does not matter whether we perform M'
measurements directly on P' outputs or whether we switch a teleportation device
in between; cf. Figure 1.2. In both cases we should get the same distribution of
measuring results for a large number of repetitions of the corresponding experiment.
This requirement should hold for any preparation P' and any measurement M',
but for fixed M and P. The latter means that we are not allowed to use a priori
knowledge about P' or M' to adapt the teleportation process (otherwise we could,
in the most extreme case, always choose P' for P and the whole discussion would
become meaningless).

The second impossible machine we have to consider is a quantum copying
machine. This is a device C which takes one quantum system p as input and produces
two systems p_1, p_2 of the same type as output. The limiting condition on C is that
p_1 and p_2 are indistinguishable from the input, where "indistinguishable" has to be
understood in the same way as above: Any statistical experiment performed with
one of the output particles (i.e. always with p_1 or always with p_2) yields the same
result as applied directly to the input p. To get such a device from teleportation
is easy: We just have to perform an M measurement on p, make two copies of the
classical data obtained, and run the preparation P on each of them; cf. Figure 1.3.
Hence if teleportation is possible, copying is possible as well.

Figure 1.2: A teleportation process should not affect the results of a statistical
experiment with quantum systems. A more precise explanation of the diagram is
given in the text.

¹ The following chain of arguments is taken from [169], where it is presented in
greater detail. This concerns in particular the construction of Bell's telephone from
a joint measurement, which we have omitted here.



Figure 1.3: Constructing a quantum copying machine from a teleportation device.

Figure 1.4: Constructing a joint measurement for the observables A and B from a
quantum copying machine.

According to the "no-cloning theorem" of Wootters and Zurek [173], however, a
quantum copy machine does not exist and this basically concludes our proof.
However, we will give an easy argument for this theorem in terms of a third impossible
machine - a joint measuring device M_AB for two arbitrary observables A and B.
This is a measuring apparatus which produces, each time it is invoked, a pair (a, b)
of classical outputs, where a is a possible output of A and b a possible output of
B. The crucial requirement for M_AB is again of statistical nature: The statistics of
the a outcomes is the same as for device A, and similarly for B. It is known from
elementary quantum mechanics that many quantum observables are not jointly
measurable in this way. The most famous examples are position and momentum or
different components of angular momentum. Nevertheless a device M_AB could be
constructed for arbitrary A and B from a quantum copy machine C. We simply have
to operate with C on the input system p, producing two outputs p_1 and p_2, and to
perform an A measurement on p_1 and a B measurement on p_2; cf. Figure 1.4. Since
the outputs p_1, p_2 are, by assumption, indistinguishable from the input p, the overall
device constructed this way would give a joint measurement for A and B. Hence a
quantum copying machine cannot exist, as stated by the no-cloning theorem. This
in turn implies that classical teleportation is impossible, and therefore we cannot
transform quantum information losslessly into classical information and back. This
concludes our chain of arguments.
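The indistinguishability requirement on a copying machine can be illustrated numerically. The following is a minimal sketch and not the argument used in the text: it assumes one particular candidate copier (a controlled-NOT gate acting on the input qubit and a blank qubit in the state |0>, chosen here purely for illustration) and compares the statistics of the Pauli x observable on the input with those on the first output. All function names are hypothetical.

```python
import math

def cnot_copy(psi):
    """Candidate copier (an assumption of this sketch): CNOT applied to psi ⊗ |0>.

    psi = (a, b) holds the amplitudes of |0> and |1>. The result lists the
    amplitudes of |00>, |01>, |10>, |11>, i.e. the state a|00> + b|11>.
    """
    a, b = psi
    return [complex(a), 0j, 0j, complex(b)]

def reduced_first(state):
    """Density matrix of the first output qubit (partial trace over the second)."""
    return [[sum(state[2 * i + k] * state[2 * j + k].conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]

def pure_density(psi):
    """Density matrix |psi><psi| of the input qubit."""
    amp = [complex(c) for c in psi]
    return [[amp[i] * amp[j].conjugate() for j in range(2)] for i in range(2)]

def expect_sigma_x(rho):
    """Expectation value tr(rho sigma_x) of the Pauli x observable."""
    return (rho[0][1] + rho[1][0]).real

zero = (1.0, 0.0)                                # basis state |0>
plus = (1 / math.sqrt(2), 1 / math.sqrt(2))      # superposition (|0> + |1>)/sqrt(2)

for name, psi in (("|0>", zero), ("|+>", plus)):
    before = expect_sigma_x(pure_density(psi))
    after = expect_sigma_x(reduced_first(cnot_copy(psi)))
    print(f"{name}: <sigma_x> on input = {before:.3f}, on first output = {after:.3f}")
```

For the basis state the two expectation values agree, while for the superposition they differ, so this particular device is not a copier in the sense defined above. The no-cloning theorem is the much stronger statement that no device whatsoever passes this test for all inputs and all observables.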

1.2 Tasks of quantum information 

So we have seen that quantum information is something new, but what can we do 
with it? There are three answers to this question which we want to present here. 
First of all let us remark that in fact all information in a modern data processing 
environment is carried by micro particles (e.g. electrons or photons). Hence quantum 
information comes automatically into play. Currently it is safe to ignore this and 
to use classical information theory to describe all relevant processes. If the size of 
the structures on a typical circuit decreases below a certain limit, however, this is 
no longer true and quantum information will become relevant. 

This leads us to the second answer. Although it is far too early to say which
concrete technologies will emerge from quantum information in the future, several
interesting proposals show that devices based on quantum information can solve
certain practical tasks much better than classical ones. The best known and most
exciting one is, without a doubt, quantum computing. The basic idea is, roughly
speaking, that a quantum computer can operate not only on one number per
register but on superpositions of numbers. This possibility leads to an "exponential
speedup" for some computations, which makes feasible problems that are considered
intractable for any classical algorithm. This is most impressively demonstrated
by Shor's factoring algorithm [139, 140]. A second example which is quite close
to a concrete practical realization (i.e. outside the laboratory; see next Section) is
quantum cryptography. The fact that it is impossible to perform a quantum
mechanical measurement without disturbing the state of the measured system is used
here for the secure transmission of a cryptographic key (i.e. each eavesdropping
attempt can be detected with certainty). Together with a subsequent application
of a classical encryption method known as the "one-time pad" this leads to a
cryptographic scheme with provable security - in contrast to currently used public key
systems whose security relies on possibly doubtful assumptions about (pseudo)
random number generators and prime numbers. We will come back to both subjects,
quantum computing and quantum cryptography, in Sections 4.5 and 4.6.

The third answer to the above question is of more fundamental nature. The
discussion of questions from information theory in the context of quantum mechanics
leads to a deeper and in many cases more quantitative understanding of quantum
theory. Maybe the most relevant example for this statement is the study of
entanglement, i.e. non-classical correlations between quantum systems, which lead to
violations of Bell inequalities.² Entanglement is a fundamental aspect of quantum
mechanics and demonstrates the differences between quantum and classical physics
in the most drastic way - this can be seen from Bell-type experiments, like the
one of Aspect et al. [?], and the discussion about it [?]. Nevertheless, for a long time
it was only considered as an exotic feature of the foundations of quantum mechanics
which is not so relevant from a practical point of view. Since quantum information
attained broader interest, however, this has changed completely. It has turned out
that entanglement is an essential resource whenever classical information processing
is outperformed by quantum devices. One of the most remarkable examples is
the experimental realization of "entanglement enhanced" teleportation [24], [?]. We
have argued in Section 1.1 that classical teleportation, i.e. transmission of quantum
information through a classical information channel, is impossible. If sender and
receiver share, however, an entangled pair of particles (which can be used as an
additional resource) the impossible task becomes, most surprisingly, possible [11]!
(We will discuss this fact in detail in Section 4.1.) The study of entanglement and
in particular the question how it can be quantified is therefore a central topic within
quantum information theory (cf. Chapter 5). Further examples for fields where
quantum information has led to a deeper and in particular more quantitative
insight include "capacities" of quantum information channels and "quantum cloning".
A detailed discussion of these topics will be given in Chapters 6 and 7. Finally let
us remark that classical information theory benefits in a similar way from the
synthesis with quantum mechanics. Besides the channel capacities just mentioned, this
concerns for example the theory of computational complexity, which analyzes the
scaling behavior of time and space consumed by an algorithm in dependence on the
size of the input data. Quantum information challenges here in particular the
fundamental Church-Turing hypothesis [152], which claims that each computation
can be simulated "efficiently" on a Turing machine; we come back to this topic in
Section 4.5.

1.3 Experimental realizations 

Although this is a theoretical paper, it is of course necessary to say something 
about experimental realizations of the ideas of quantum information. Let us consider 
quantum computing first. Whatever way we go here, we need systems which can 
be prepared very precisely in a few distinct states (i.e. we need "qubits"), which can
be manipulated afterwards individually (we have to realize "quantum gates") and 
which can finally be measured with an appropriate observable (we have to "read 
out" the result). 

One of the most advanced approaches to quantum computing is the ion trap
technique (see Sections 4.3 and 5.3 in [?] and Section 7.6 of [122] for an overview
and further references). A "quantum register" is realized here by a string of ions
kept by electromagnetic fields in high vacuum inside a Paul trap, and two
long-lived states of each ion are chosen to represent "0" and "1". A single ion can be
manipulated by laser beams and this allows the implementation of all "one-qubit
gates". To get two-qubit gates as well (for a quantum computer we need at least one
two-qubit gate together with all one-qubit operations; cf. Section 4.5) the collective
motional state of the ions has to be used. A "program" on an ion trap quantum
computer now starts with a preparation of the register in an initial state - usually
the ground state of the ions. This is done by optical pumping and laser cooling
(which is in fact one of the most difficult parts of the whole procedure, in particular
if many ions are involved). Then the "network" of quantum gates is applied, in
terms of a (complicated) sequence of laser pulses. Finally, the readout is done by
laser beams which illuminate the ions one after another. The beams are tuned to a
fast transition which affects only one of the qubit states and the fluorescent light is
detected. Concrete implementations (see e.g. [118, 102]) are currently restricted to
two qubits; however, there is some hope that we will be able to control up to 10 or
12 qubits in the not too distant future.

² This is only a very rough characterization. A more precise one will be given in
Section 2.2.

A second quite successful technique is NMR quantum computing (see Section
5.4 of [?] and Section 7.7 of [122] together with the references therein for details).
NMR stands for "nuclear magnetic resonance" and it is the study of transitions
between Zeeman levels of an atomic nucleus in a magnetic field. The qubits are in
this case different spin states of the nuclei in an appropriate molecule and quantum
gates are realized by high frequency oscillating magnetic fields in pulses of controlled
duration. In contrast to ion traps, however, we do not use one molecule but a whole
cup of liquid containing a macroscopic number of them. This causes a number of
problems, concerning in particular the preparation of an initial state, fluctuations in
the free time evolution of the molecules and the readout. There are several ways to
overcome these difficulties and we refer the reader again to [?] and [122] for details.
Concrete implementations of NMR quantum computers are capable of using up to
five qubits [113]. Other realizations include the implementation of several known
quantum algorithms on two and three qubits; see e.g. [109].



The fundamental problem of the two methods for quantum computation
discussed so far is their lack of scalability. It is realistic to assume that NMR and
ion-trap quantum computers with up to tens of qubits will exist at some point in the
future, but not with the thousands of qubits which are necessary for "real world"
applications. There are, however, many alternative proposals available and some
of them might be able to avoid this problem. The following is a small (not at all
exhaustive) list: atoms in optical lattices [?], semiconductor nanostructures such as
quantum dots (there are many works in this area, some recent ones are [149], [?], [29])
and arrays of Josephson junctions [112].

A second circle of experiments we want to mention here is grouped around
quantum communication and quantum cryptography (for a more detailed overview
let us refer to [163] and [?]). Realizations of quantum cryptography are fairly far
developed and it is currently possible to span up to 50 km with optical fibers (e.g.
[?]). Potentially greater distances can be bridged by "free space cryptography"
where the quantum information is transmitted through the air (e.g. [?]). With this
technology satellites can be used as some sort of "relays", thus enabling quantum
key distribution over arbitrary distances. In the meantime there are quite a lot
of successful implementations. For a detailed discussion we refer the reader
to the review of Gisin et al. [?] and the references therein. Other experiments
concern the usage of entanglement in quantum communication. The creation and
detection of entangled photons is here a fundamental building block. Nowadays this
is no problem and the most famous experiment in this context is the one of Aspect
et al. [?], where the maximal violation of Bell inequalities was demonstrated with
polarization correlated photons. Another spectacular experiment is the creation
of entangled photons over a distance of 10 km using standard telecommunication
optical fibers by the Geneva group [151]. Among the most exciting applications
of entanglement are the realization of entanglement based quantum key distribution
[?], the first successful "teleportation" of a photon [?], [?] and the implementation
of "dense coding" [115]; cf. Section 4.1.



Chapter 2 
Basic concepts 



After we have got a first, rough impression of the basic ideas and most
relevant subjects of quantum information theory, let us start with a more detailed
presentation. First we have to introduce the fundamental notions of the theory and
their mathematical description. Fortunately, much of the material we would have
to present here, like Hilbert spaces, tensor products and density matrices, is known
already from quantum mechanics and we can focus our discussion on those concepts
which are less familiar, like POV measures, completely positive maps and entangled
states.

2.1 Systems, States and Effects 

Like classical probability theory, quantum mechanics is a statistical theory. Hence its
predictions are of probabilistic nature and can only be tested if the same experiment
is repeated very often and the relative frequencies of the outcomes are calculated.
In more operational terms this means: the experiment has to be repeated according
to the same procedure, as it can be set out in a detailed laboratory manual. If we
consider a somewhat idealized model of such a statistical experiment we get in
fact two different types of procedures: first, preparation procedures which prepare
a certain kind of physical system in a distinguished state and second, registration
procedures measuring a particular observable.

A mathematical description of such a setup basically consists of two sets $\mathcal{S}$ and
$\mathcal{E}$ and a map $\mathcal{S} \times \mathcal{E} \ni (\rho, A) \mapsto \rho(A) \in [0, 1]$. The elements of $\mathcal{S}$ describe the states,
i.e. preparations, while the $A \in \mathcal{E}$ represent all yes/no measurements (effects) which
can be performed on the system. The probability (i.e. the relative frequency for a
large number of repetitions) to get the result "yes", if we are measuring the effect
$A$ on a system prepared in the state $\rho$, is given by $\rho(A)$. This is a very general
scheme applicable not only to quantum mechanics but also to a very broad class
of statistical models, containing in particular classical probability. In order to make
use of it we have to specify of course the precise structure of the sets $\mathcal{S}$ and $\mathcal{E}$ and
the map $\rho(A)$ for the types of systems we want to discuss.
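As a minimal sketch of this scheme, assume a system with a finite set of elementary events, so that a state is a probability distribution and an effect is a function with values in [0, 1] (this is the classical case treated in Subsection 2.1.3). The function names and the die example below are illustrative choices, not notation from the text.

```python
def is_state(p, tol=1e-12):
    """A classical state: non-negative weights that sum to one."""
    return all(w >= 0 for w in p.values()) and abs(sum(p.values()) - 1.0) <= tol

def is_classical_effect(f):
    """A classical effect: a yes/no measurement with response f(x) in [0, 1]."""
    return all(0.0 <= v <= 1.0 for v in f.values())

def probability(p, f):
    """rho(A): probability of the answer "yes" for the effect f in the state p."""
    return sum(p[x] * f[x] for x in p)

# State: a fair die. Effects: "the result is even", measured sharply and
# measured by a detector that responds in only 90% of the cases.
die = {x: 1 / 6 for x in range(1, 7)}
even = {x: float(x % 2 == 0) for x in range(1, 7)}
fuzzy_even = {x: 0.9 * even[x] for x in even}

assert is_state(die)
assert is_classical_effect(even) and is_classical_effect(fuzzy_even)
print(probability(die, even), probability(die, fuzzy_even))
```

The map `probability` plays the role of $(\rho, A) \mapsto \rho(A)$: it always returns a number in [0, 1] when it is fed a valid state and a valid effect.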

2.1.1. Operator algebras. Throughout this paper we will encounter three
different kinds of systems: quantum and classical systems and hybrid systems which
are half classical, half quantum (cf. Subsection 2.2.2). In this subsection we will
describe a general way to define states and effects which is applicable to all three cases
and which therefore provides a handy way to discuss all three cases simultaneously
(this will become most useful in Sections 2.2 and 2.3).



The scheme we are going to discuss is based on an algebra $\mathcal{A}$ of bounded
operators acting on a Hilbert space $\mathcal{H}$. More precisely $\mathcal{A}$ is a (closed) linear
subspace of $\mathcal{B}(\mathcal{H})$, the algebra of bounded operators on $\mathcal{H}$, which contains the identity
($\mathbb{1} \in \mathcal{A}$) and is closed under products ($A, B \in \mathcal{A} \Rightarrow AB \in \mathcal{A}$) and adjoints ($A \in \mathcal{A}
\Rightarrow A^* \in \mathcal{A}$). For simplicity we will refer to each such $\mathcal{A}$ as an observable algebra.
The key observation is now that each type of system we will study in the following
can be completely characterized by its observable algebra $\mathcal{A}$, i.e. once $\mathcal{A}$ is known
there is a systematic way to derive the sets $\mathcal{S}$ and $\mathcal{E}$ and the map $(\rho, A) \mapsto \rho(A)$
from it. We frequently make use of this fact by referring to systems in terms of their
observable algebra $\mathcal{A}$, or even by identifying them with their algebra and saying
that $\mathcal{A}$ is the system.

Although $\mathcal{A}$ and $\mathcal{H}$ can be infinite dimensional in general, we will consider only
finite dimensional Hilbert spaces, as long as nothing else is explicitly stated. Since
most research in quantum information is done up to now for finite dimensional
systems (the only exception in this work is the discussion of Gaussian systems in
Section 3.3) this is not too severe a loss of generality. Hence we can choose $\mathcal{H} = \mathbb{C}^d$
and $\mathcal{B}(\mathcal{H})$ is just the algebra of complex $d \times d$ matrices. Since $\mathcal{A}$ is a subalgebra
of $\mathcal{B}(\mathcal{H})$ it operates naturally on $\mathcal{H}$ and it inherits from $\mathcal{B}(\mathcal{H})$ the operator norm
$\|A\| = \sup_{\|\psi\|=1} \|A\psi\|$ and the operator ordering $A \geq B \Leftrightarrow \langle\psi, A\psi\rangle \geq \langle\psi, B\psi\rangle$
$\forall \psi \in \mathcal{H}$. Now we can define:

$$\mathcal{S}(\mathcal{A}) = \{\rho \in \mathcal{A}^* \mid \rho \geq 0,\ \rho(\mathbb{1}) = 1\} \tag{2.1}$$

where $\mathcal{A}^*$ denotes the dual space of $\mathcal{A}$, i.e. the set of all linear functionals on $\mathcal{A}$, and
$\rho \geq 0$ means $\rho(A) \geq 0\ \forall A \geq 0$. Elements of $\mathcal{S}(\mathcal{A})$ describe the states of the system
in question while effects are given by

$$\mathcal{E}(\mathcal{A}) = \{A \in \mathcal{A} \mid A \geq 0,\ A \leq \mathbb{1}\}. \tag{2.2}$$

The probability to measure the effect $A$ in the state $\rho$ is $\rho(A)$. More generally we can
look at $\rho(A)$ for an arbitrary $A$ as the expectation value of $A$ in the state $\rho$. Hence
the idea behind Equation (2.1) is to define states in terms of their expectation value
functionals.

Both spaces are convex, i.e. $\rho, \sigma \in \mathcal{S}(\mathcal{A})$ and $0 \leq \lambda \leq 1$ implies $\lambda\rho + (1 - \lambda)\sigma \in
\mathcal{S}(\mathcal{A})$ and similarly for $\mathcal{E}(\mathcal{A})$. The extremal points of $\mathcal{S}(\mathcal{A})$ respectively $\mathcal{E}(\mathcal{A})$, i.e. those
elements which do not admit a proper convex decomposition ($x = \lambda y + (1 - \lambda)z \Rightarrow
\lambda = 1$ or $\lambda = 0$ or $y = z = x$), play a distinguished role: the extremal points of $\mathcal{S}(\mathcal{A})$
are pure states and those of $\mathcal{E}(\mathcal{A})$ are the propositions of the system in question. The
latter represent those effects which register a property with certainty, in contrast
to non-extremal effects which admit some "fuzziness". As a simple example for the
latter consider a detector which registers particles not with certainty but only with
a probability which is smaller than one.

Finally let us note that the complete discussion of this section can be generalized
easily to infinite dimensional systems, if we replace $\mathcal{H} = \mathbb{C}^d$ by an infinite
dimensional Hilbert space (e.g. $\mathcal{H} = L^2(\mathbb{R})$). This would require however more material
about C* algebras and measure theory than we want to use in this paper.

2.1.2. Quantum mechanics. For quantum mechanics we have

$$\mathcal{A} = \mathcal{B}(\mathcal{H}), \tag{2.3}$$

where we have chosen again $\mathcal{H} = \mathbb{C}^d$. The corresponding systems are called d-level
systems, or qubits if $d = 2$ holds. To avoid clumsy notations we frequently write $\mathcal{S}(\mathcal{H})$
and $\mathcal{E}(\mathcal{H})$ instead of $\mathcal{S}[\mathcal{B}(\mathcal{H})]$ and $\mathcal{E}[\mathcal{B}(\mathcal{H})]$. From Equation (2.2) we immediately
see that an operator $A \in \mathcal{B}(\mathcal{H})$ is an effect iff it is positive and bounded from
above by $\mathbb{1}$. An element $P \in \mathcal{E}(\mathcal{H})$ is a proposition iff $P$ is a projection operator
($P^2 = P$).
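The two criteria just stated can be checked directly for a single qubit. The sketch below assumes Hermitian 2 x 2 matrices given as nested lists, for which the eigenvalues are available in closed form; the helper names are hypothetical and the restriction to d = 2 is only for brevity.

```python
import math

def eigenvalues_2x2(m):
    """Eigenvalues of a Hermitian 2x2 matrix [[a, c], [conj(c), d]]."""
    a = complex(m[0][0]).real
    d = complex(m[1][1]).real
    c = complex(m[0][1])
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(c) ** 2)
    return mean - radius, mean + radius

def is_quantum_effect(m, tol=1e-12):
    """Effect: 0 <= A <= 1, i.e. all eigenvalues lie in [0, 1]."""
    lowest, highest = eigenvalues_2x2(m)
    return lowest >= -tol and highest <= 1 + tol

def is_proposition(m, tol=1e-12):
    """Proposition: an effect that is a projection, P^2 = P."""
    square = [[sum(m[i][k] * m[k][j] for k in range(2)) for j in range(2)]
              for i in range(2)]
    idempotent = all(abs(square[i][j] - m[i][j]) <= tol
                     for i in range(2) for j in range(2))
    return is_quantum_effect(m, tol) and idempotent

projector_plus = [[0.5, 0.5], [0.5, 0.5]]   # projection onto (|0> + |1>)/sqrt(2)
fuzzy_detector = [[0.8, 0.0], [0.0, 0.0]]   # registers |0> with probability 0.8 only
too_large = [[1.2, 0.0], [0.0, 0.0]]        # violates A <= 1

for name, m in (("projector", projector_plus), ("fuzzy", fuzzy_detector),
                ("too large", too_large)):
    print(name, is_quantum_effect(m), is_proposition(m))
```

The second matrix corresponds to the imperfect detector mentioned in Subsection 2.1.1: it is an effect but not a proposition.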

States are usually described in quantum mechanics by density matrices, i.e.
positive and normalized trace class¹ operators. To make contact with the general
definition in Equation (2.1) note first that $\mathcal{B}(\mathcal{H})$ is a Hilbert space with the
Hilbert-Schmidt scalar product $\langle A, B\rangle = \mathrm{tr}(A^*B)$. Hence each linear functional $\rho \in \mathcal{B}(\mathcal{H})^*$
can be expressed in terms of a (trace class) operator $\tilde\rho$ by² $A \mapsto \rho(A) = \mathrm{tr}(\tilde\rho A)$. It is
obvious that each $\tilde\rho$ defines a unique functional $\rho$. If we start on the other hand with
$\rho$ we can recover the matrix elements of $\tilde\rho$ from $\rho$ by $\tilde\rho_{kj} = \mathrm{tr}(\tilde\rho\,|j\rangle\langle k|) = \rho(|j\rangle\langle k|)$,
where $|j\rangle\langle k|$ denotes the canonical basis of $\mathcal{B}(\mathcal{H})$ (i.e. $|j\rangle\langle k|_{ab} = \delta_{ja}\delta_{kb}$). More
generally we get for $\psi, \phi \in \mathcal{H}$ the relation $\langle\phi, \tilde\rho\psi\rangle = \rho(|\psi\rangle\langle\phi|)$, where now $|\psi\rangle\langle\phi|$
denotes the rank one operator which maps $\eta \in \mathcal{H}$ to $\langle\phi, \eta\rangle\psi$. In the following we
drop the $\sim$ and use the same symbol for the operator and the functional whenever
confusion can be avoided. Due to the same abuse of language we will frequently
interpret elements of $\mathcal{B}(\mathcal{H})^*$ as (trace class) operators instead of linear functionals
(and write $\mathrm{tr}(\rho A)$ instead of $\rho(A)$). However we do not identify $\mathcal{B}(\mathcal{H})^*$ with $\mathcal{B}(\mathcal{H})$
in general, because the two different notations help to keep track of the distinction
between spaces of states and spaces of observables. In addition we equip $\mathcal{B}^*(\mathcal{H})$
with the trace-norm $\|\rho\|_1 = \mathrm{tr}|\rho|$ instead of the operator norm.

¹ On a finite dimensional Hilbert space this attribute is of course redundant, since
each operator is of trace class in this case. Nevertheless we will frequently use this
terminology, due to greater consistency with the infinite dimensional case.

² If we consider infinite dimensional systems this is not true. In this case the dual
space of the observable algebra is much larger and Equation (2.1) leads to states
which are not necessarily given by trace class operators. Such "singular states" play
an important role in theories which admit an infinite number of degrees of freedom
like quantum statistics and quantum field theory; cf. [?], [?]. For applications of
singular states within quantum information see [?].

Positivity of the functional $\rho$ implies positivity of the operator $\rho$ due to
$0 \leq \rho(|\psi\rangle\langle\psi|) = \langle\psi, \rho\psi\rangle$ and the same holds for normalization: $1 = \rho(\mathbb{1}) = \mathrm{tr}(\rho)$.
Hence we can identify the state space from Equation (2.1) with the set of density
matrices, as expected for quantum mechanics. Pure states of a quantum system
are the one dimensional projectors. As usual we will frequently identify the density
matrix $|\psi\rangle\langle\psi|$ with the wave function $\psi$ and call the latter, in abuse of language, a
state.

To get a useful parameterization of the state space consider again the Hilbert- 
Schmidt scalar product (p, a) = ti-{p*a), but now on 23* (J{). The space of trace free 
matrices in 23*(J{) (alternatively the functionals with p(I) = 0) is the corresponding 
orthocomplement I"*" of the unit operator. If we choose a basis CTi, . . . , a'^2_i with 
{aj,ak) = 26 jk in I""" we can write each selfajoint (trace class) operator p with 
tr{p) = 1 as 

p=- + - 51 ^J^J ^+ 2^-^' withf gR'^'-i. (2.4) 

If = 2 or d = 3 holds, it is most natural to choose the Pauli matrices respectively 
the Gell-Mann matrices (cf. e.g. Sect. 13.4 of [Q) for the aj. In the qubit case it is 
easy to see that p > holds iff |x| < 1. Hence the state space S(C^) coincides with 
the Block ball {x Q M.^ \ \x\ < 1}, and the set of pure states with its boundary, the 
Block sphere {x G M.'^ \ \x\ — 1}. This shows in a very geometric way that the pure 
states are the extremal points of the convex set §(!K). If p is more generally a pure 
state of a d-level system we get 

1 1, 



1 = tr(p2) = _ + _|f|2 ^ |f| ^ ^2 (1 - l/d). (2.5) 

This implies that all states are contained in the ball with radius $2^{1/2}(1 - 1/d)^{1/2}$,
however not all operators in this set are positive. A simple example is
$d^{-1}\mathbb{1} \pm 2^{-1/2}(1 - 1/d)^{1/2}\sigma_j$, which is positive only if $d = 2$ holds.
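The qubit case of the parameterization (2.4) can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `bloch_to_density` and `is_positive` are hypothetical and only serve this illustration.

```python
import numpy as np

# Pauli matrices; they satisfy tr(s_j s_k) = 2 * delta_jk, as required for (2.4).
PAULI = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def bloch_to_density(x):
    """Return rho = 1/2 * (identity + x . sigma) for a real 3-vector x."""
    x = np.asarray(x, dtype=float)
    return 0.5 * (np.eye(2) + np.einsum("j,jab->ab", x, PAULI))

def is_positive(op, tol=1e-12):
    """Check positive semidefiniteness of a Hermitian matrix via its spectrum."""
    return bool(np.min(np.linalg.eigvalsh(op)) >= -tol)

inside = bloch_to_density([0.3, 0.2, 0.1])     # |x| < 1: mixed state
on_sphere = bloch_to_density([0.0, 0.0, 1.0])  # |x| = 1: pure state
outside = bloch_to_density([0.0, 1.2, 0.0])    # |x| > 1: trace one, but not positive
```

All three operators have trace one, but only the first two are positive, and only the point on the sphere satisfies $\mathrm{tr}(\rho^2) = 1$.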

2.1.3. Classical probability. — Since the difference between classical and quantum
systems is an important issue in this work let us reformulate classical probability
theory according to the general scheme from Subsection 2.1.1. The restriction to
finite dimensional observable algebras leads now to the assumption that all systems
we are considering admit a finite set $X$ of elementary events. Typical examples are:
throwing a die $X = \{1, \ldots, 6\}$, tossing a coin $X = \{$"head", "tail"$\}$ or classical
bits $X = \{0, 1\}$. To simplify the notations we write (as in quantum mechanics) $\mathcal{S}(X)$
and $\mathcal{E}(X)$ for the spaces of states and effects.

The observable algebra $\mathcal{A}$ of such a system is the space

$$\mathcal{A} = \mathcal{C}(X) = \{f : X \to \mathbb{C}\} \tag{2.6}$$

of complex valued functions on $X$. To interpret this as an operator algebra acting
on a Hilbert space $\mathcal{H}$ (as indicated in Subsection 2.1.1) choose an arbitrary but
fixed orthonormal basis $|x\rangle$, $x \in X$ in $\mathcal{H}$ and identify the function $f \in \mathcal{C}(X)$ with
the operator $f = \sum_x f_x |x\rangle\langle x| \in \mathcal{B}(\mathcal{H})$ (we use the same symbol for the function
and the operator, provided confusion can be avoided). Most frequently we have
$X = \{1, \ldots, d\}$ and we can choose $\mathcal{H} = \mathbb{C}^d$ and the canonical basis for $|x\rangle$. Hence
$\mathcal{C}(X)$ becomes the algebra of diagonal $d \times d$ matrices. Using Equation (2.2) we
immediately see that $f \in \mathcal{C}(X)$ is an effect iff $0 \leq f_x \leq 1$, $\forall x \in X$. Physically
we can interpret $f_x$ as the probability that the effect $f$ registers the elementary
event $x$. This makes the distinction between propositions and "fuzzy" effects very
transparent: $P \in \mathcal{E}(X)$ is a proposition iff we have either $P_x = 1$ or $P_x = 0$ for all
$x \in X$. Hence the propositions $P \in \mathcal{C}(X)$ are in one to one correspondence with
the subsets $\omega_P = \{x \in X \mid P_x = 1\} \subset X$ which in turn describe the events of the
system. Hence $P$ registers the event $\omega_P$ with certainty, while a fuzzy effect $f < P$
does this only with a probability less than one.

Since $\mathcal{C}(X)$ is finite dimensional and admits the distinguished basis $|x\rangle\langle x|$, $x \in X$
it is naturally isomorphic to its dual $\mathcal{C}^*(X)$. More precisely: each linear functional
$\rho \in \mathcal{C}^*(X)$ defines and is uniquely defined by the function $x \mapsto \rho_x = \rho(|x\rangle\langle x|)$ and
we have $\rho(f) = \sum_x f_x \rho_x$. As in the quantum case we will identify the function $\rho$
with the linear functional and use the same symbol for both, although we keep the
notation $\mathcal{C}^*(X)$ to indicate that we are talking about states rather than observables.

Positivity of $\rho \in \mathcal{C}^*(X)$ is given by $\rho_x \geq 0$ for all $x$ and normalization leads
to $1 = \rho(\mathbb{1}) = \rho\left(\sum_x |x\rangle\langle x|\right) = \sum_x \rho_x$. Hence to be a state $\rho \in \mathcal{C}^*(X)$ must be a
probability distribution on $X$ and $\rho_x$ is the probability that the elementary event $x$
occurs during statistical experiments with systems in the state $\rho$. More generally
$\rho(f) = \sum_j \rho_j f_j$ is the probability to measure the effect $f$ on systems in the state $\rho$.
If $P$ is in particular a proposition, $\rho(P)$ gives the probability for the event $\omega_P$. The
pure states of the system are the Dirac measures $\delta_x$, $x \in X$, with $\delta_x(|y\rangle\langle y|) = \delta_{xy}$.
Hence each $\rho \in \mathcal{S}(X)$ can be decomposed in a unique way into a convex linear
combination of pure states.
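As a small illustration of the classical scheme, functions on $X$ can be stored as vectors, i.e. as the diagonals of the corresponding diagonal matrices. The sketch below assumes NumPy and uses a die as the system; the function names and the particular numbers are made up for this example.

```python
import numpy as np

# A die: X = {1, ..., 6}; a function on X is stored as a length-6 vector.
rho = np.full(6, 1 / 6)                          # state: uniform distribution
f = np.array([0.0, 0.5, 1.0, 0.2, 0.0, 1.0])     # a "fuzzy" effect, 0 <= f_x <= 1
P = np.array([0, 1, 0, 1, 0, 1], dtype=float)    # proposition "the outcome is even"

def is_state(p, tol=1e-12):
    """Positivity and normalization: p is a probability distribution on X."""
    return bool(np.all(p >= -tol) and abs(p.sum() - 1) < tol)

def is_effect(g, tol=1e-12):
    """An effect satisfies 0 <= g_x <= 1 for all x."""
    return bool(np.all(g >= -tol) and np.all(g <= 1 + tol))

def is_proposition(g):
    """A proposition takes only the values 0 and 1."""
    return bool(np.all((g == 0) | (g == 1)))

def probability(p, g):
    """rho(f) = sum_x f_x rho_x, which equals tr(diag(p) @ diag(g))."""
    return float(p @ g)

prob_even = probability(rho, P)    # probability of the event {2, 4, 6}
prob_fuzzy = probability(rho, f)   # probability that the fuzzy effect registers
```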

2.1.4. Observables. — Up to now we have discussed only effects, i.e. yes/no
experiments. In this subsection we will have a first short look at more general
observables. We will come back to this topic in Section 3.2.4 after we have introduced
channels. We can think of an observable $E$ taking its values in a finite set $X$ as a
map which associates to each possible outcome $x \in X$ the effect $E_x \in \mathcal{E}(\mathcal{A})$ (if $\mathcal{A}$ is
the observable algebra of the system in question) which is true if $x$ is measured and
false otherwise. If the measurement is performed on systems in the state $\rho$ we get
for each $x \in X$ the probability $p_x = \rho(E_x)$ to measure $x$. Hence the family of the
$p_x$ should be a probability distribution on $X$, and this implies that $E$ should be a
POV measure on $X$.

Definition 2.1.1 Consider an observable algebra $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ and a finite^3 set $X$.
A family $E = (E_x)_{x \in X}$ of effects in $\mathcal{A}$ (i.e. $0 \leq E_x \leq \mathbb{1}$) is called a positive
operator valued measure (POV measure) on $X$ if $\sum_{x \in X} E_x = \mathbb{1}$ holds. If all $E_x$
are projections, $E$ is called projection valued measure (PV measure).

From basic quantum mechanics we know that observables are described by selfadjoint
operators on a Hilbert space $\mathcal{H}$. But how does this point of view fit into
the previous definition? The answer is given by the spectral theorem (Thm. VIII.6
[134]): Each selfadjoint operator $A$ on a finite dimensional Hilbert space $\mathcal{H}$ has
the form $A = \sum_{\lambda \in \sigma(A)} \lambda P_\lambda$ where $\sigma(A)$ denotes the spectrum of $A$, i.e. the set of
eigenvalues and $P_\lambda$ denotes the projection onto the corresponding eigenspace. Hence
there is a unique PV measure $P = (P_\lambda)_{\lambda \in \sigma(A)}$ associated to $A$ which is called the
spectral measure of $A$. It is uniquely characterized by the property that the expectation
value $\sum_\lambda \lambda \rho(P_\lambda)$ of $P$ in the state $\rho$ is given for any state $\rho$ by $\rho(A) = \mathrm{tr}(\rho A)$,
as it is well known from quantum mechanics. Hence the traditional way to define
observables within quantum mechanics perfectly fits into the scheme just outlined,
however it only covers the projection valued case and therefore admits no fuzziness.
For this reason POV measures are sometimes called generalized observables.

^3 This is of course an artificial restriction and in many situations not justified (cf. in particular
the discussion of quantum state estimation in Section 4.2 and in a later chapter). However, it helps us to
avoid measure theoretical subtleties; cf. Holevo's book |[rS|] for a more general discussion.

Finally note that the eigenprojections $P_\lambda$ of $A$ are elements of an observable
algebra $\mathcal{A}$ iff $A \in \mathcal{A}$. This shows two things: First of all we can consider selfadjoint
elements of any *-subalgebra $\mathcal{A}$ of $\mathcal{B}(\mathcal{H})$ as observables of $\mathcal{A}$-systems, and this is
precisely the reason why we have called $\mathcal{A}$ observable algebra. Secondly we see why
it is essential that $\mathcal{A}$ is really a subalgebra of $\mathcal{B}(\mathcal{H})$: if it is only a linear subspace
of $\mathcal{B}(\mathcal{H})$ the relation $A \in \mathcal{A}$ does not imply $P_\lambda \in \mathcal{A}$.
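The relation between a selfadjoint operator and its spectral measure can be made concrete in finite dimensions. The following sketch assumes NumPy; `spectral_measure` is a hypothetical helper that groups numerically equal eigenvalues, and the matrices `A` and `rho` are arbitrary examples.

```python
import numpy as np

def spectral_measure(A, decimals=10):
    """Return {eigenvalue: projector} for a Hermitian matrix A.

    Eigenvalues that agree after rounding are treated as one eigenvalue,
    so that degenerate eigenspaces get a single projector.
    """
    evals, evecs = np.linalg.eigh(A)
    projectors = {}
    for lam, v in zip(np.round(evals, decimals), evecs.T):
        projectors[lam] = projectors.get(lam, 0) + np.outer(v, v.conj())
    return projectors

# Example observable with spectrum {-1, 1}; the eigenvalue 1 is twofold degenerate.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
rho = np.diag([0.5, 0.3, 0.2])          # some density matrix

pvm = spectral_measure(A)

# Outcome probabilities rho(P_lambda) and the resulting expectation value.
probs = {lam: np.trace(rho @ P).real for lam, P in pvm.items()}
expectation = sum(lam * p for lam, p in probs.items())
```

The projectors sum to the identity, so `pvm` is a PV measure in the sense of Definition 2.1.1, and `expectation` agrees with $\mathrm{tr}(\rho A)$.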

2.2 Composite systems and entangled states 

Composite systems occur in many places in quantum information theory. A typical
example is a register of a quantum computer, which can be regarded as a system
consisting of $N$ qubits (if $N$ is the length of the register). The crucial point is that
this opens the possibility for correlations and entanglement between subsystems.
In particular entanglement is of great importance, because it is a central resource
in many applications of quantum information theory like entanglement enhanced
teleportation or quantum computing - we already discussed this in Section 1.2 of
the introduction. To explain entanglement in greater detail and to introduce some 
necessary formalism we have to complement the scheme developed in the last section 
by a procedure which allows us to construct states and observables of the composite 
system from its subsystems. In quantum mechanics this is done of course in terms 
of tensor products, and we will review in the following some of the most relevant 
material. 

2.2.1. Tensor products. — Consider two (finite dimensional) Hilbert spaces $\mathcal{H}$
and $\mathcal{K}$. To each pair of vectors $\psi_1 \in \mathcal{H}$, $\psi_2 \in \mathcal{K}$ we can associate a bilinear form
$\psi_1 \otimes \psi_2$ called the tensor product of $\psi_1$ and $\psi_2$ by $\psi_1 \otimes \psi_2(\phi_1, \phi_2) = \langle\psi_1, \phi_1\rangle\langle\psi_2, \phi_2\rangle$.
For two product vectors $\psi_1 \otimes \psi_2$ and $\eta_1 \otimes \eta_2$ their scalar product is defined by
$\langle\psi_1 \otimes \psi_2, \eta_1 \otimes \eta_2\rangle = \langle\psi_1, \eta_1\rangle\langle\psi_2, \eta_2\rangle$ and it can be shown that this definition extends
in a unique way to the span of all $\psi_1 \otimes \psi_2$ which therefore defines the tensor product
$\mathcal{H} \otimes \mathcal{K}$. If we have more than two Hilbert spaces $\mathcal{H}_j$, $j = 1, \ldots, N$ their tensor
product $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_N$ can be defined similarly.

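The tensor product $A_1 \otimes A_2$ of two bounded operators $A_1 \in \mathcal{B}(\mathcal{H})$, $A_2 \in \mathcal{B}(\mathcal{K})$
is defined first for product vectors $\psi_1 \otimes \psi_2 \in \mathcal{H} \otimes \mathcal{K}$ by $A_1 \otimes A_2(\psi_1 \otimes \psi_2) =
(A_1\psi_1) \otimes (A_2\psi_2)$ and then extended by linearity. The space $\mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ coincides
with the span of all $A_1 \otimes A_2$. If $\rho \in \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ is not of product form (and of
trace class for infinite dimensional $\mathcal{H}$ and $\mathcal{K}$) there is nevertheless a way to define
"restrictions" to $\mathcal{H}$ respectively $\mathcal{K}$ called the partial trace of $\rho$. It is defined by the
equation

$$\mathrm{tr}[\mathrm{tr}_{\mathcal{K}}(\rho)A] = \mathrm{tr}(\rho\, A \otimes \mathbb{1}) \quad \forall A \in \mathcal{B}(\mathcal{H}) \tag{2.7}$$

where the trace on the left hand side is over $\mathcal{H}$ and on the right hand side over
$\mathcal{H} \otimes \mathcal{K}$.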
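In finite dimensions the partial trace is a simple index contraction, and the defining relation (2.7) can be verified directly. The sketch below assumes NumPy and the ordering convention of `np.kron` for the product basis; the function name `partial_trace_second` and the random test data are illustrative choices.

```python
import numpy as np

def partial_trace_second(rho, d1, d2):
    """Trace out the second factor of an operator on C^d1 (x) C^d2.

    After reshaping, rho[i, k, j, l] = <i k| rho |j l>; the partial trace
    sums over the diagonal k = l of the second factor.
    """
    return np.einsum("ikjk->ij", rho.reshape(d1, d2, d1, d2))

rng = np.random.default_rng(0)
d1, d2 = 2, 3

# Random density matrix on C^2 (x) C^3 and a random operator A on C^2.
G = rng.normal(size=(d1 * d2, d1 * d2)) + 1j * rng.normal(size=(d1 * d2, d1 * d2))
rho = G @ G.conj().T
rho = rho / np.trace(rho)
A = rng.normal(size=(d1, d1)) + 1j * rng.normal(size=(d1, d1))

lhs = np.trace(partial_trace_second(rho, d1, d2) @ A)   # trace over H
rhs = np.trace(rho @ np.kron(A, np.eye(d2)))            # trace over H (x) K
```

Both sides of (2.7) agree up to rounding errors, and for a product operator the partial trace returns the first factor multiplied by the trace of the second.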

If two orthonormal bases $\phi_1, \ldots, \phi_n$ and $\psi_1, \ldots, \psi_m$ are given in $\mathcal{H}$ respectively
$\mathcal{K}$ we can consider the product basis $\phi_1 \otimes \psi_1, \ldots, \phi_n \otimes \psi_m$ in $\mathcal{H} \otimes \mathcal{K}$, and we can
expand each $\Psi \in \mathcal{H} \otimes \mathcal{K}$ as $\Psi = \sum_{jk} \Psi_{jk}\, \phi_j \otimes \psi_k$ with $\Psi_{jk} = \langle\phi_j \otimes \psi_k, \Psi\rangle$. This
procedure works for an arbitrary number of tensor factors. However, if we have
exactly a twofold tensor product, there is a more economic way to expand $\Psi$ called
Schmidt decomposition in which only diagonal terms of the form $\phi_j \otimes \psi_j$ appear.

Proposition 2.2.1 For each element $\Psi$ of the twofold tensor product $\mathcal{H} \otimes \mathcal{K}$ there
are orthonormal systems $\phi_j$, $j = 1, \ldots, n$ and $\psi_k$, $k = 1, \ldots, n$ (not necessarily
bases, i.e. $n$ can be smaller than $\dim \mathcal{H}$ and $\dim \mathcal{K}$) of $\mathcal{H}$ and $\mathcal{K}$ respectively such
that $\Psi = \sum_j \sqrt{\lambda_j}\, \phi_j \otimes \psi_j$ holds. The $\phi_j$ and $\psi_j$ are uniquely determined by $\Psi$. The
expansion is called Schmidt decomposition and the numbers $\sqrt{\lambda_j}$ are the Schmidt
coefficients.

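Proof. Consider the partial trace $\rho_1 = \mathrm{tr}_{\mathcal{K}}(|\Psi\rangle\langle\Psi|)$ of the one dimensional projector
$|\Psi\rangle\langle\Psi|$ associated to $\Psi$. It can be decomposed in terms of its eigenvectors $\phi_n$ and we
get $\mathrm{tr}_{\mathcal{K}}(|\Psi\rangle\langle\Psi|) = \rho_1 = \sum_n \lambda_n |\phi_n\rangle\langle\phi_n|$. Now we can choose an orthonormal basis
$\psi'_k$, $k = 1, \ldots, n_2$ in $\mathcal{K}$ and expand $\Psi$ with respect to $\phi_j \otimes \psi'_k$. Carrying out the $k$
summation we get a family of vectors $\psi''_j = \sum_k \langle\phi_j \otimes \psi'_k, \Psi\rangle \psi'_k$ with the property
$\Psi = \sum_j \phi_j \otimes \psi''_j$. Now we can calculate the partial trace and get for any $A \in \mathcal{B}(\mathcal{H})$:

$$\sum_j \lambda_j \langle\phi_j, A\phi_j\rangle = \mathrm{tr}(\rho_1 A) = \langle\Psi, (A \otimes \mathbb{1})\Psi\rangle = \sum_{j,k} \langle\phi_j, A\phi_k\rangle \langle\psi''_j, \psi''_k\rangle. \tag{2.8}$$

Since $A$ is arbitrary we can compare the left and right hand side of this equation
term by term and we get $\langle\psi''_j, \psi''_k\rangle = \delta_{jk}\lambda_j$. Hence $\psi_j = \lambda_j^{-1/2}\psi''_j$ is the desired
orthonormal system. □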
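Numerically, the Schmidt decomposition can be obtained from a singular value decomposition of the coefficient matrix $\Psi_{jk}$. This is a different route than the proof above (which diagonalizes the partial trace), but it yields the same data, since the squared singular values are the eigenvalues $\lambda_j$ of $\rho_1$. The sketch assumes NumPy; `schmidt_decomposition` and `rebuild` are hypothetical helper names.

```python
import numpy as np

def schmidt_decomposition(psi, d1, d2, tol=1e-12):
    """Return (coeffs, phis, psis) with psi = sum_j coeffs[j] * kron(phis[j], psis[j]).

    coeffs are the Schmidt coefficients sqrt(lambda_j); the rows of phis and
    psis are orthonormal systems in C^d1 and C^d2 respectively.
    """
    M = np.asarray(psi).reshape(d1, d2)          # coefficient matrix Psi_jk
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    keep = s > tol                               # drop vanishing coefficients
    return s[keep], U[:, keep].T, Vh[keep, :]

def rebuild(coeffs, phis, psis):
    """Reassemble the vector from its Schmidt data."""
    return sum(c * np.kron(p, q) for c, p, q in zip(coeffs, phis, psis))

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)    # (|00> + |11>) / sqrt(2)
product = np.kron([1.0, 0.0], [0.0, 1.0])             # |0> (x) |1>

coeffs_bell, _, _ = schmidt_decomposition(bell, 2, 2)
coeffs_product, _, _ = schmidt_decomposition(product, 2, 2)
```

A product vector has exactly one nonvanishing Schmidt coefficient, while the vector `bell` has two equal ones.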

As an immediate application of this result we can show that each mixed state
$\rho \in \mathcal{B}^*(\mathcal{H})$ (of the quantum system $\mathcal{B}(\mathcal{H})$) can be regarded as a pure state on a
larger Hilbert space $\mathcal{H} \otimes \mathcal{H}'$. We just have to consider the eigenvalue expansion
$\rho = \sum_j \lambda_j |\phi_j\rangle\langle\phi_j|$ of $\rho$ and to choose an arbitrary orthonormal system $\psi_j$, $j = 1, \ldots, n$
in $\mathcal{H}'$. Using Proposition 2.2.1 we get

Corollary 2.2.2 Each state $\rho \in \mathcal{B}^*(\mathcal{H})$ can be extended to a pure state $\Psi$ on a
larger system with Hilbert space $\mathcal{H} \otimes \mathcal{H}'$ such that $\mathrm{tr}_{\mathcal{H}'} |\Psi\rangle\langle\Psi| = \rho$ holds.
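The construction behind Corollary 2.2.2 translates directly into a few lines of code. The following sketch assumes NumPy, takes $\mathcal{H}' = \mathbb{C}^d$ with its canonical basis as the orthonormal system $\psi_j$ (one possible choice among many), and uses a hypothetical function name `purify`.

```python
import numpy as np

def purify(rho, tol=1e-12):
    """Return a unit vector Psi in C^d (x) C^d with tr_2 |Psi><Psi| = rho.

    Uses the eigenvalue expansion rho = sum_j lam_j |phi_j><phi_j| and the
    canonical basis e_j of the second factor: Psi = sum_j sqrt(lam_j) phi_j (x) e_j.
    """
    lam, phi = np.linalg.eigh(rho)               # columns phi[:, j] are eigenvectors
    d = rho.shape[0]
    psi = np.zeros(d * d, dtype=complex)
    for j in range(d):
        if lam[j] > tol:
            psi += np.sqrt(lam[j]) * np.kron(phi[:, j], np.eye(d)[j])
    return psi

rho = np.array([[0.7, 0.1],
                [0.1, 0.3]])                     # an example mixed state
Psi = purify(rho)

# Partial trace over the second factor, written via the coefficient matrix.
M = Psi.reshape(2, 2)
rho_reduced = M @ M.conj().T
```

The reduced state `rho_reduced` reproduces `rho`, although `Psi` itself is a pure state of the larger system.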

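2.2.2. Compound and hybrid systems. — To discuss the composition of two
arbitrary (i.e. classical or quantum) systems it is very convenient to use the scheme
developed in Subsection 2.1.1 and to talk about the two subsystems in terms of
their observable algebras $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ and $\mathcal{B} \subset \mathcal{B}(\mathcal{K})$. The observable algebra of the
composite system is then simply given by the tensor product of $\mathcal{A}$ and $\mathcal{B}$, i.e.

$$\mathcal{A} \otimes \mathcal{B} := \mathrm{span}\{A \otimes B \mid A \in \mathcal{A},\ B \in \mathcal{B}\} \subset \mathcal{B}(\mathcal{H} \otimes \mathcal{K}). \tag{2.9}$$

The dual of $\mathcal{A} \otimes \mathcal{B}$ is generated by product states, $(\rho \otimes \sigma)(A \otimes B) = \rho(A)\sigma(B)$, and
we therefore write $\mathcal{A}^* \otimes \mathcal{B}^*$ for $(\mathcal{A} \otimes \mathcal{B})^*$.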

The interpretation of the composed system $\mathcal{A} \otimes \mathcal{B}$ in terms of states and effects
is straightforward and therefore postponed to the next Subsection. We will consider
first the special cases arising from different choices for $\mathcal{A}$ and $\mathcal{B}$. If both systems are
quantum ($\mathcal{A} = \mathcal{B}(\mathcal{H})$ and $\mathcal{B} = \mathcal{B}(\mathcal{K})$) we get

$$\mathcal{B}(\mathcal{H}) \otimes \mathcal{B}(\mathcal{K}) = \mathcal{B}(\mathcal{H} \otimes \mathcal{K}) \tag{2.10}$$

as expected. For two classical systems $\mathcal{A} = \mathcal{C}(X)$ and $\mathcal{B} = \mathcal{C}(Y)$ recall that elements
of $\mathcal{C}(X)$ (respectively $\mathcal{C}(Y)$) are complex valued functions on $X$ (on $Y$). Hence the
tensor product $\mathcal{C}(X) \otimes \mathcal{C}(Y)$ consists of complex valued functions on $X \times Y$, i.e.
$\mathcal{C}(X) \otimes \mathcal{C}(Y) = \mathcal{C}(X \times Y)$. In other words states and observables of the composite
system $\mathcal{C}(X) \otimes \mathcal{C}(Y)$ are, in accordance with classical probability theory, given by
probability distributions and random variables on the Cartesian product $X \times Y$.

If only one subsystem is classical and the other is quantum, e.g. a micro particle
interacting with a classical measuring device, we have a hybrid system. The elements
of its observable algebra $\mathcal{C}(X) \otimes \mathcal{B}(\mathcal{H})$ can be regarded as operator valued functions
on $X$, i.e. $X \ni x \mapsto A_x \in \mathcal{B}(\mathcal{H})$, and $A$ is an effect iff $0 \leq A_x \leq \mathbb{1}$ holds for all
$x \in X$. The elements of the dual $\mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{H})$ are in a similar way $\mathcal{B}^*(\mathcal{H})$ valued
functions $X \ni x \mapsto \rho_x \in \mathcal{B}^*(\mathcal{H})$ and $\rho$ is a state iff each $\rho_x$ is a positive trace class
operator on $\mathcal{H}$ and $\sum_x \mathrm{tr}(\rho_x) = 1$. The probability to measure the effect $A$ in the state
$\rho$ is $\sum_x \rho_x(A_x)$.

2.2.3. Correlations and entanglement. — Let us now consider two effects
$A \in \mathcal{A}$ and $B \in \mathcal{B}$, then $A \otimes B$ is an effect of the composite system $\mathcal{A} \otimes \mathcal{B}$. It
is interpreted as the joint measurement of $A$ on the first and $B$ on the second
subsystem, where the "yes" outcome means "both effects give yes". In particular
$A \otimes \mathbb{1}$ means to measure $A$ on the first subsystem and to ignore the second one
completely. If $\rho$ is a state of $\mathcal{A} \otimes \mathcal{B}$ we can define its restrictions by $\rho^{\mathcal{A}}(A) = \rho(A \otimes \mathbb{1})$
and $\rho^{\mathcal{B}}(B) = \rho(\mathbb{1} \otimes B)$. If both systems are quantum the restrictions of $\rho$ are the
partial traces, while in the classical case we have to sum over the $\mathcal{B}$, respectively
$\mathcal{A}$ variables. For two states $\rho_1 \in \mathcal{S}(\mathcal{A})$ and $\rho_2 \in \mathcal{S}(\mathcal{B})$ there is always a state $\rho$ of
$\mathcal{A} \otimes \mathcal{B}$ such that $\rho_1 = \rho^{\mathcal{A}}$ and $\rho_2 = \rho^{\mathcal{B}}$ holds: We just have to choose the product
state $\rho_1 \otimes \rho_2$. However in general we have $\rho \neq \rho^{\mathcal{A}} \otimes \rho^{\mathcal{B}}$ which means nothing else
than that $\rho$ also contains correlations between the two subsystems.

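Definition 2.2.3 A state $\rho$ of a bipartite system $\mathcal{A} \otimes \mathcal{B}$ is called correlated if there
are some $A \in \mathcal{A}$, $B \in \mathcal{B}$ such that $\rho(A \otimes B) \neq \rho^{\mathcal{A}}(A)\rho^{\mathcal{B}}(B)$ holds.

We immediately see that $\rho = \rho_1 \otimes \rho_2$ implies $\rho(A \otimes B) = \rho_1(A)\rho_2(B) =
\rho^{\mathcal{A}}(A)\rho^{\mathcal{B}}(B)$, hence $\rho$ is not correlated. If on the other hand $\rho(A \otimes B) = \rho^{\mathcal{A}}(A)\rho^{\mathcal{B}}(B)$
holds for all $A$ and $B$ we get $\rho = \rho^{\mathcal{A}} \otimes \rho^{\mathcal{B}}$. Hence, the definition of correlations just given perfectly
fits into our intuitive considerations.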
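For two quantum systems the criterion just stated (a state is uncorrelated iff it is the product of its restrictions) is easy to test numerically. The sketch below assumes NumPy; the function names and the two example states are illustrative choices and not taken from the text.

```python
import numpy as np

def reduced_states(rho, d1, d2):
    """Return the restrictions (partial traces) of rho to both subsystems."""
    r = rho.reshape(d1, d2, d1, d2)
    return np.einsum("ikjk->ij", r), np.einsum("kikj->ij", r)

def is_correlated(rho, d1, d2, tol=1e-10):
    """rho is uncorrelated iff it equals the product of its restrictions."""
    rho_a, rho_b = reduced_states(rho, d1, d2)
    return not np.allclose(rho, np.kron(rho_a, rho_b), atol=tol)

SZ = np.diag([1.0, -1.0])

# A product state and a mixture of |00><00| and |11><11|.
product = np.kron(np.diag([0.5, 0.5]), np.diag([0.9, 0.1]))
mixture = np.diag([0.5, 0.0, 0.0, 0.5])

# Explicit pair of observables exhibiting the correlation: A = B = sigma_z.
rho_a, rho_b = reduced_states(mixture, 2, 2)
gap = (np.trace(mixture @ np.kron(SZ, SZ))
       - np.trace(rho_a @ SZ) * np.trace(rho_b @ SZ))
```

The second state is correlated although it is merely a mixture of product states; the quantity `gap` is the difference $\rho(A \otimes B) - \rho^{\mathcal{A}}(A)\rho^{\mathcal{B}}(B)$ for $A = B = \sigma_z$.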

An important issue in quantum information theory is the comparison of correlations
between quantum systems on the one hand and classical systems on the other.
Hence let us have a closer look at the state space of a system consisting of at least
one classical subsystem.

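Proposition 2.2.4 Each state $\rho$ of a composite system $\mathcal{A} \otimes \mathcal{B}$ consisting of a classical
($\mathcal{A} = \mathcal{C}(X)$) and an arbitrary system ($\mathcal{B}$) has the form

$$\rho = \sum_{j \in X} \lambda_j\, \rho_j^{\mathcal{A}} \otimes \rho_j^{\mathcal{B}} \tag{2.11}$$

with positive weights $\lambda_j > 0$ and $\rho_j^{\mathcal{A}} \in \mathcal{S}(\mathcal{A})$, $\rho_j^{\mathcal{B}} \in \mathcal{S}(\mathcal{B})$.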

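Proof. Since $\mathcal{A} = \mathcal{C}(X)$ is classical, there is a basis $|j\rangle\langle j| \in \mathcal{A}$, $j \in X$ of mutually
orthogonal one-dimensional projectors and we can write each $A \in \mathcal{A}$ as $\sum_j a_j |j\rangle\langle j|$
(cf. Subsection 2.1.3). For each state $\rho \in \mathcal{S}(\mathcal{A} \otimes \mathcal{B})$ we can now define $\rho_j^{\mathcal{A}} \in \mathcal{S}(\mathcal{A})$
with $\rho_j^{\mathcal{A}}(A) = \mathrm{tr}(A|j\rangle\langle j|) = a_j$ and $\rho_j^{\mathcal{B}} \in \mathcal{S}(\mathcal{B})$ with $\rho_j^{\mathcal{B}}(B) = \lambda_j^{-1}\rho(|j\rangle\langle j| \otimes B)$ and
$\lambda_j = \rho(|j\rangle\langle j| \otimes \mathbb{1})$. Hence we get $\rho = \sum_{j \in X} \lambda_j\, \rho_j^{\mathcal{A}} \otimes \rho_j^{\mathcal{B}}$ with positive $\lambda_j$ as stated.
□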

If $\mathcal{A}$ and $\mathcal{B}$ are two quantum systems it is still possible for them to be correlated
in the way just described. We can simply prepare them with a classical random
generator which triggers two preparation devices to produce systems in the states
$\rho_j^{\mathcal{A}}$, $\rho_j^{\mathcal{B}}$ with probability $\lambda_j$. The overall state produced by this setup is obviously
the $\rho$ from Equation (2.11). However, the crucial point is that not all correlations of
quantum systems are of this type! This is an immediate consequence of the definition
of pure states $\rho = |\Psi\rangle\langle\Psi| \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$: Since there is no proper convex decomposition of
$\rho$, it can be written as in Proposition 2.2.4 iff $\Psi$ is a product vector, i.e. $\Psi = \phi \otimes \psi$.
This observation motivates the following definition.

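Definition 2.2.5 A state $\rho$ of the composite system $\mathcal{B}(\mathcal{H}_1) \otimes \mathcal{B}(\mathcal{H}_2)$ is called separable
or classically correlated if it can be written as

$$\rho = \sum_j \lambda_j\, \rho_j^{(1)} \otimes \rho_j^{(2)} \tag{2.12}$$

with states $\rho_j^{(k)}$ of $\mathcal{B}(\mathcal{H}_k)$ and weights $\lambda_j > 0$. Otherwise $\rho$ is called entangled. The
set of all separable states is denoted by $\mathcal{D}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ or just $\mathcal{D}$ if $\mathcal{H}_1$ and $\mathcal{H}_2$ are
understood.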

2.2.4. Bell inequalities. — We have just seen that it is quite easy for pure states
to check whether they are entangled or not. In the mixed case however this is a much
bigger, and in general unsolved, problem. In this subsection we will have a short
look at Bell inequalities, which are maybe the oldest criterion for entanglement (for
a more detailed review see [164]). Today more powerful methods, most of them
based on positivity properties, are available. We will postpone the corresponding
discussion to the end of the following section, after we have studied (completely)
positive maps (cf. Section 2.4).



Bell inequalities are traditionally discussed in the framework of "local hidden
variable theories". More precisely we will say that a state $\rho$ of a bipartite system
$\mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ admits a hidden variable model, if there is a probability space $(X, \mu)$ and
(measurable) response functions $X \ni x \mapsto F_A(x, k), F_B(x, l) \in \mathbb{R}$ for all discrete PV
measures $A = A_1, \ldots, A_N \in \mathcal{B}(\mathcal{H})$ respectively $B = B_1, \ldots, B_M \in \mathcal{B}(\mathcal{K})$ such that

$$\int_X F_A(x, k) F_B(x, l)\, \mu(dx) = \mathrm{tr}(\rho\, A_k \otimes B_l) \tag{2.13}$$

holds for all $k$, $l$ and $A$, $B$. The value of the functions $F_A(x, k)$ is interpreted as
the probability to get the value $k$ during an $A$ measurement with known "hidden
parameter" $x$. The set of states admitting a hidden variable model is a convex set
and as such it can be described by an (infinite) hierarchy of correlation inequalities.
Any one of these inequalities is usually called (generalized) Bell inequality. The
most well known one is that given by Clauser, Horne, Shimony and Holt [47]: The
state $\rho$ satisfies the CHSH-inequality if

$$\rho\bigl(A \otimes (B + B') + A' \otimes (B - B')\bigr) \leq 2 \tag{2.14}$$

holds for all $A, A' \in \mathcal{B}(\mathcal{H})$ respectively $B, B' \in \mathcal{B}(\mathcal{K})$, with $-\mathbb{1} \leq A, A' \leq \mathbb{1}$ and
$-\mathbb{1} \leq B, B' \leq \mathbb{1}$. For the special case of two dichotomic observables the CHSH
inequalities are sufficient to characterize the states with a hidden variable model. In
the general case the CHSH-inequalities are a necessary but not a sufficient condition
and a complete characterization is not known.

It is now easy to see that each separable state $\rho = \sum_{j=1}^n \lambda_j\, \rho_j^{(1)} \otimes \rho_j^{(2)}$ admits
a hidden variable model: we have to choose $X = \{1, \ldots, n\}$, $\mu(\{j\}) = \lambda_j$,
$F_A(x, k) = \rho_x^{(1)}(A_k)$ and $F_B$ analogously. Hence we immediately see that each
state of a composite system with at least one classical subsystem satisfies the Bell
inequalities (in particular the CHSH version) while this is not the case for pure
quantum systems. The most prominent examples are "maximally entangled states"
(cf. Subsection 3.1.1) which violate the CHSH inequality (for appropriately chosen
$A, A', B, B'$) with a maximal value of $2\sqrt{2}$. This observation is the starting point
for many discussions concerning the interpretation of quantum mechanics, in particular
because the maximal violation of $2\sqrt{2}$ was observed in 1982 experimentally
by Aspect and coworkers § . We do not want to follow this path (see [164] and the
references therein instead). Interesting for us is the fact that Bell inequalities,
in particular the CHSH case in Equation (2.14), provide a necessary condition for
a state $\rho$ to be separable. However there exist entangled states admitting a hidden



variable model |16(;]. Hence, Bell inequalities are not sufficient for separability. 
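The violation of the CHSH inequality (2.14) by a maximally entangled state can be reproduced with a few matrix operations. The sketch below assumes NumPy and uses one standard choice of observables built from Pauli matrices; this particular choice and the helper name `chsh_value` are assumptions of the example, not prescribed by the text.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])

def chsh_value(rho, A, A2, B, B2):
    """Evaluate rho(A (x) (B + B') + A' (x) (B - B')) from Equation (2.14)."""
    op = np.kron(A, B + B2) + np.kron(A2, B - B2)
    return float(np.real(np.trace(rho @ op)))

# Observables with spectrum {-1, +1}, so -1 <= A, A', B, B' <= 1 holds.
A, A2 = SZ, SX
B, B2 = (SZ + SX) / np.sqrt(2), (SZ - SX) / np.sqrt(2)

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)      # (|00> + |11>) / sqrt(2)
rho_bell = np.outer(bell, bell.conj())
rho_product = np.diag([1.0, 0.0, 0.0, 0.0])             # |00><00|

chsh_bell = chsh_value(rho_bell, A, A2, B, B2)
chsh_product = chsh_value(rho_product, A, A2, B, B2)
```

For these settings the maximally entangled state reaches $2\sqrt{2}$, while the product state stays below the bound 2.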

2.3 Channels 

Assume now that we have a number of quantum systems, e.g. a string of ions in 
a trap. To "process" the quantum information they carry we have to perform in 
general many steps of a quite different nature. Typical examples are: free time 
evolution, controlled time evolution (e.g. the application of a "quantum gate" in a 
quantum computer), preparations and measurements. The purpose of this section is 
to provide a unified framework for the description of all these different operations. 
The basic idea is to represent each processing step by a "channel", which converts
input systems, described by an observable algebra $\mathcal{A}$, into output systems described
by a possibly different algebra $\mathcal{B}$. Henceforth we will call $\mathcal{A}$ the input and $\mathcal{B}$ the
output algebra. If we consider e.g. the free time evolution, we need quantum systems
of the same type on the input and the output side, hence in this case we have
$\mathcal{A} = \mathcal{B} = \mathcal{B}(\mathcal{H})$ with an appropriately chosen Hilbert space $\mathcal{H}$. If on the other hand
we want to describe a measurement we have to map quantum systems (the measured
system) to classical information (the measuring result). Therefore we need in this
example $\mathcal{A} = \mathcal{B}(\mathcal{H})$ for the input and $\mathcal{B} = \mathcal{C}(X)$ for the output algebra, where $X$ is
the set of possible outcomes of the measurement (cf. Subsection 2.1.4).

Our aim is now to get a mathematical object which can be used to describe a
channel. To this end consider an effect $A \in \mathcal{B}$ of the output system. If we invoke first
a channel which transforms $\mathcal{A}$ systems into $\mathcal{B}$ systems, and measure $A$ afterwards
on the output systems, we end up with a measurement of an effect $T(A)$ on the
input systems. Hence we get a map $T : \mathcal{E}(\mathcal{B}) \to \mathcal{E}(\mathcal{A})$ which completely describes the
channel^4. Alternatively we can look at the states and interpret a channel as a map
$T^* : \mathcal{S}(\mathcal{A}) \to \mathcal{S}(\mathcal{B})$ which transforms $\mathcal{A}$ systems in the state $\rho \in \mathcal{S}(\mathcal{A})$ into $\mathcal{B}$ systems
in the state $T^*(\rho)$. To distinguish between both maps we can say that $T$ describes
the channel in the Heisenberg picture and $T^*$ in the Schrödinger picture. On the level
of the statistical interpretation both points of view should coincide of course, i.e. the
probabilities $(T^*\rho)(A)$ and $\rho(TA)$ to get the result "yes" during an $A$ measurement
on $\mathcal{B}$ systems in the state $T^*\rho$, respectively a $TA$ measurement on $\mathcal{A}$ systems in
the state $\rho$, should be the same. Since $(T^*\rho)(A)$ is linear in $A$ we see immediately
that $T$ must be an affine map, i.e. $T(\lambda_1 A_1 + \lambda_2 A_2) = \lambda_1 T(A_1) + \lambda_2 T(A_2)$ for each
convex linear combination $\lambda_1 A_1 + \lambda_2 A_2$ of effects in $\mathcal{B}$, and this in turn implies that
$T$ can be extended naturally to a linear map, which we will identify in the following
with the channel itself, i.e. we say that $T$ is the channel.

2.3.1. Completely positive maps. — Let us change now slightly our point of
view and start with a linear operator $T : \mathcal{A} \to \mathcal{B}$. To be a channel, $T$ must map
effects to effects, i.e. $T$ has to be positive: $T(A) \geq 0$ $\forall A \geq 0$ and bounded from
above by $\mathbb{1}$, i.e. $T(\mathbb{1}) \leq \mathbb{1}$. In addition it is natural to require that two channels
in parallel are again a channel. More precisely, if two channels $T : \mathcal{A}_1 \to \mathcal{B}_1$ and
$S : \mathcal{A}_2 \to \mathcal{B}_2$ are given we can consider the map $T \otimes S$ which associates to each
$A \otimes B \in \mathcal{A}_1 \otimes \mathcal{A}_2$ the tensor product $T(A) \otimes S(B) \in \mathcal{B}_1 \otimes \mathcal{B}_2$. It is natural to
assume that $T \otimes S$ is a channel which converts composite systems of type $\mathcal{A}_1 \otimes \mathcal{A}_2$
into $\mathcal{B}_1 \otimes \mathcal{B}_2$ systems. Hence $T \otimes S$ should be positive as well |12[]



^4 Note that the direction of the mapping arrow is reversed compared to the natural ordering of
processing.

^5 To keep notations more readable we will frequently follow the usual convention to drop the
parentheses around arguments of linear operators. Hence we will write $TA$ and $T^*\rho$ instead of
$T(A)$ and $T^*(\rho)$. Similarly we will simply write $TS$ instead of $T \circ S$ for compositions.

Definition 2.3.1 Consider two observable algebras $\mathcal{A}$, $\mathcal{B}$ and a linear map
$T : \mathcal{A} \to \mathcal{B} \subset \mathcal{B}(\mathcal{H})$.

1. $T$ is called positive if $T(A) \geq 0$ holds for all positive $A \in \mathcal{A}$.

2. $T$ is called completely positive (cp) if $T \otimes \mathrm{Id} : \mathcal{A} \otimes \mathcal{B}(\mathbb{C}^n) \to \mathcal{B}(\mathcal{H}) \otimes \mathcal{B}(\mathbb{C}^n)$
is positive for all $n \in \mathbb{N}$. Here $\mathrm{Id}$ denotes the identity map on $\mathcal{B}(\mathbb{C}^n)$.

3. $T$ is called unital if $T(\mathbb{1}) = \mathbb{1}$ holds.

Consider now the map $T^* : \mathcal{B}^* \to \mathcal{A}^*$ which is dual to $T$, i.e. $T^*\rho(A) = \rho(TA)$
for all $\rho \in \mathcal{B}^*$ and $A \in \mathcal{A}$. It is called the Schrödinger picture representation of the
channel $T$, since it maps states to states provided $T$ is unital. (Complete) positivity
can be defined in the Schrödinger picture as in the Heisenberg picture and we
immediately see that $T$ is (completely) positive iff $T^*$ is.

It is natural to ask whether the distinction between positivity and complete
positivity is really necessary, i.e. whether there are positive maps which are not
completely positive. If at least one of the algebras $\mathcal{A}$ or $\mathcal{B}$ is classical the answer
is no: each positive map is completely positive in this case. If both algebras are
quantum however complete positivity is not implied by positivity alone. We will
discuss explicit examples in Subsection 2.4.2.

If item 2 holds only for a fixed $n \in \mathbb{N}$ the map $T$ is called $n$-positive. This is
obviously a weaker condition than complete positivity. However, $n$-positivity implies
$m$-positivity for all $m \leq n$, and for $\mathcal{A} = \mathcal{B}(\mathbb{C}^d)$ complete positivity is implied by
$n$-positivity, provided $n \geq d$ holds.

Let us consider now the question whether a channel should be unital or not. We
have already mentioned that $T(\mathbb{1}) \leq \mathbb{1}$ must hold since effects should be mapped to
effects. If $T(\mathbb{1})$ is not equal to $\mathbb{1}$ we get $\rho(T\mathbb{1}) = T^*\rho(\mathbb{1}) < 1$ for the probability to
measure the effect $\mathbb{1}$ on systems in the state $T^*\rho$, but this is impossible for channels
which produce an output with certainty, because $\mathbb{1}$ is the effect which is always true.
In other words: If a cp map is not unital it describes a channel which sometimes
produces no output at all and $T(\mathbb{1})$ is the effect which measures whether we have
got an output. We will assume in the future that channels are unital if nothing else
is explicitly stated.

2.3.2. The Stinespring theorem. — Consider now channels between quantum
systems, i.e. $\mathcal{A} = \mathcal{B}(\mathcal{H}_1)$ and $\mathcal{B} = \mathcal{B}(\mathcal{H}_2)$. A fairly simple example (not necessarily
unital) is given in terms of an operator $V : \mathcal{H}_1 \to \mathcal{H}_2$ by $\mathcal{B}(\mathcal{H}_1) \ni A \mapsto VAV^* \in
\mathcal{B}(\mathcal{H}_2)$. A second example is the restriction to a subsystem, which is given in the
Heisenberg picture by $\mathcal{B}(\mathcal{H}) \ni A \mapsto A \otimes \mathbb{1}_{\mathcal{K}} \in \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$. Finally the composition
$S \circ T = ST$ of two channels is again a channel. The following theorem, which is the
most fundamental structural result about cp maps^6, says that each channel can be
represented as a composition of these two examples [147].



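Theorem 2.3.2 (Stinespring dilation theorem) Every completely positive
map $T : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H}_2)$ has the form

$$T(A) = V^*(A \otimes \mathbb{1}_{\mathcal{K}})V, \tag{2.15}$$

with an additional Hilbert space $\mathcal{K}$ and an operator $V : \mathcal{H}_2 \to \mathcal{H}_1 \otimes \mathcal{K}$. Both (i.e.
$\mathcal{K}$ and $V$) can be chosen such that the span of all $(A \otimes \mathbb{1})V\phi$ with $A \in \mathcal{B}(\mathcal{H}_1)$
and $\phi \in \mathcal{H}_2$ is dense in $\mathcal{H}_1 \otimes \mathcal{K}$. This particular decomposition is unique (up to
unitary equivalence) and called the minimal decomposition. If $\dim \mathcal{H}_1 = d_1$ and
$\dim \mathcal{H}_2 = d_2$ the minimal $\mathcal{K}$ satisfies $\dim \mathcal{K} \leq d_1 d_2$.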

^6 Basically there is a more general version of this theorem which works with arbitrary output
algebras. It needs however some material from representation theory of C*-algebras which we want
to avoid here. See e.g. |l^

By introducing a family $|\chi_j\rangle\langle\chi_j|$ of one dimensional projectors with
$\sum_j |\chi_j\rangle\langle\chi_j| = \mathbb{1}$ we can define the "Kraus operators" $\langle\psi, V_j\phi\rangle = \langle\psi \otimes \chi_j, V\phi\rangle$.
In terms of them we can rewrite Equation (2.15) in the following form |10

Corollary 2.3.3 (Kraus form) Every completely positive map $T : \mathcal{B}(\mathcal{H}_1) \to
\mathcal{B}(\mathcal{H}_2)$ can be written in the form

$$T(A) = \sum_{j=1}^N V_j^* A V_j \tag{2.16}$$

with operators $V_j : \mathcal{H}_2 \to \mathcal{H}_1$ and $N \leq \dim(\mathcal{H}_1)\dim(\mathcal{H}_2)$.
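The passage from the Stinespring form (2.15) to the Kraus form can be illustrated numerically. The sketch below assumes NumPy; the dimensions, the random isometry $V$ and the function names are hypothetical choices, and the canonical basis of $\mathcal{K}$ plays the role of the vectors $\chi_j$.

```python
import numpy as np

rng = np.random.default_rng(1)
d1, d2, dk = 2, 3, 4      # dim H_1, dim H_2, dim K (example sizes)

# Random isometry V : H_2 -> H_1 (x) K, so that T(1) = V* V = 1 (unital channel).
G = rng.normal(size=(d1 * dk, d2)) + 1j * rng.normal(size=(d1 * dk, d2))
V, _ = np.linalg.qr(G)    # reduced QR: the columns of V are orthonormal

def stinespring(A):
    """Heisenberg picture map T(A) = V* (A (x) 1_K) V, Equation (2.15)."""
    return V.conj().T @ np.kron(A, np.eye(dk)) @ V

# Kraus operators V_j : H_2 -> H_1 defined by <psi, V_j phi> = <psi (x) chi_j, V phi>.
kraus = [V.reshape(d1, dk, d2)[:, j, :] for j in range(dk)]

def kraus_form(A):
    """Heisenberg picture map T(A) = sum_j V_j* A V_j."""
    return sum(Vj.conj().T @ A @ Vj for Vj in kraus)

def schroedinger(rho):
    """Dual map T*(rho) = sum_j V_j rho V_j*, acting on states of the H_2 system."""
    return sum(Vj @ rho @ Vj.conj().T for Vj in kraus)

A = rng.normal(size=(d1, d1))
A = A + A.T                              # a selfadjoint operator on H_1
rho = np.diag([0.5, 0.3, 0.2])           # a state on H_2
```

Both forms of $T$ agree, $T$ is unital, and the duality $T^*\rho(A) = \rho(TA)$ between the Schrödinger and the Heisenberg picture holds for the pair `schroedinger` and `kraus_form`.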

2.3.3. The duality lemma. — We will consider a fundamental relation between
positive maps and bipartite systems, which will allow us later on to translate properties
of entangled states to properties of channels and vice versa. The basic idea
originates from elementary linear algebra: A bilinear form on a $d$-dimensional
vector space $V$ can be represented by a $d \times d$-matrix, just as an operator on $V$.
Hence, we can transform the bilinear form into an operator simply by reinterpreting the matrix
elements. In our situation things are more difficult, because the positivity constraints
for states and channels should match up in the right way. Nevertheless we have the
following theorem.

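Theorem 2.3.4 Let $\rho$ be a density operator on $\mathcal{H} \otimes \mathcal{H}_1$. Then there is a Hilbert
space $\mathcal{K}$, a pure state $\sigma$ on $\mathcal{H} \otimes \mathcal{K}$ and a channel $T : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{K})$ with

$$\rho = (\mathrm{Id} \otimes T^*)\sigma, \tag{2.17}$$

where $\mathrm{Id}$ denotes the identity map on $\mathcal{B}^*(\mathcal{H})$. The pure state $\sigma$ can be chosen such
that $\mathrm{tr}_{\mathcal{H}}(\sigma)$ has no zero eigenvalue. In this case $T$ and $\sigma$ are uniquely determined
(up to unitary equivalence) by Equation (2.17); i.e. if $\tilde{\sigma}$, $\tilde{T}$ with $\rho = (\mathrm{Id} \otimes \tilde{T}^*)\tilde{\sigma}$ are
given, we have $\tilde{\sigma} = (\mathbb{1} \otimes U)^*\sigma(\mathbb{1} \otimes U)$ and $\tilde{T}(\,\cdot\,) = U^*T(\,\cdot\,)U$ with an appropriate
unitary operator $U$.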

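Proof. The state $\sigma$ is obviously the purification of $\mathrm{tr}_{\mathcal{H}_1}(\rho)$. Hence if $\lambda_j$ and
$\psi_j$ are eigenvalues and eigenvectors of $\mathrm{tr}_{\mathcal{H}_1}(\rho)$ we can set $\sigma = |\Psi\rangle\langle\Psi|$ with
$\Psi = \sum_j \sqrt{\lambda_j}\, \psi_j \otimes \phi_j$ where $\phi_j$ is an (arbitrary) orthonormal basis in $\mathcal{K}$. It is
clear that $\sigma$ is uniquely determined up to a unitary. Hence we only have to show
that a unique $T$ exists if $\Psi$ is given. To satisfy Equation (2.17) we must have

$$\rho\bigl(|\psi_j \otimes \eta_k\rangle\langle\psi_l \otimes \eta_p|\bigr) = \bigl\langle\Psi, (\mathrm{Id} \otimes T)\bigl(|\psi_j \otimes \eta_k\rangle\langle\psi_l \otimes \eta_p|\bigr)\Psi\bigr\rangle \tag{2.18}$$
$$= \bigl\langle\Psi, |\psi_j\rangle\langle\psi_l| \otimes T\bigl(|\eta_k\rangle\langle\eta_p|\bigr)\Psi\bigr\rangle \tag{2.19}$$
$$= \sqrt{\lambda_j\lambda_l}\, \bigl\langle\phi_j, T\bigl(|\eta_k\rangle\langle\eta_p|\bigr)\phi_l\bigr\rangle, \tag{2.20}$$

where $\eta_k$ is an (arbitrary) orthonormal basis in $\mathcal{H}_1$. Hence $T$ is uniquely determined
by $\rho$ in terms of its matrix elements and we only have to check complete positivity.
To this end it is useful to note that the map $\rho \mapsto T$ is linear if the $\lambda_j$ are fixed.
Hence it is sufficient to consider the case $\rho = |\chi\rangle\langle\chi|$. Inserting this in Equation
(2.20) we immediately see that $T(A) = V^*AV$ with $\langle V\phi_j, \eta_k\rangle = \lambda_j^{-1/2}\langle\chi, \psi_j \otimes \eta_k\rangle$
holds. Hence $T$ is completely positive. Since normalization $T(\mathbb{1}) = \mathbb{1}$ follows from
the choice of the $\lambda_j$ the theorem is proved. □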
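The construction in the proof can be followed step by step in a small numerical experiment. The sketch below assumes NumPy, takes $\mathcal{K} = \mathbb{C}^d$ with its canonical basis as $\phi_j$, and requires the restriction $\mathrm{tr}_{\mathcal{H}_1}(\rho)$ to have full rank; the name `dual_pair` and the random input state are assumptions of this example.

```python
import numpy as np

def dual_pair(rho, d, d1):
    """Given rho on C^d (x) C^d1, return (Psi, T) with rho = (Id (x) T*)|Psi><Psi|."""
    r = rho.reshape(d, d1, d, d1)                # r[a, p, b, k] = <a p| rho |b k>
    rho_H = np.einsum("ikjk->ij", r)             # restriction tr_{H_1}(rho)
    lam, psi = np.linalg.eigh(rho_H)             # eigenvalues lam_j, eigenvectors psi[:, j]
    # Purification Psi = sum_j sqrt(lam_j) psi_j (x) phi_j in H (x) K.
    Psi = sum(np.sqrt(lam[j]) * np.kron(psi[:, j], np.eye(d)[j]) for j in range(d))
    # Matrix elements from (2.18)-(2.20):
    # W[j, k, l, p] = <phi_j, T(|eta_k><eta_p|) phi_l>
    #               = <psi_l (x) eta_p, rho psi_j (x) eta_k> / sqrt(lam_j lam_l).
    W = np.einsum("al,apbk,bj->jklp", psi.conj(), r, psi)
    s = np.sqrt(lam)
    W = W / (s[:, None, None, None] * s[None, None, :, None])

    def T(A):
        """Heisenberg picture channel T : B(H_1) -> B(K), extended linearly."""
        return np.einsum("kp,jklp->jl", A, W)

    return Psi, T

rng = np.random.default_rng(2)
d, d1 = 2, 3
G = rng.normal(size=(d * d1, d * d1)) + 1j * rng.normal(size=(d * d1, d * d1))
rho = G @ G.conj().T
rho = rho / np.trace(rho)

Psi, T = dual_pair(rho, d, d1)

# Check Equation (2.17) on a product operator B (x) A.
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A = rng.normal(size=(d1, d1)) + 1j * rng.normal(size=(d1, d1))
lhs = np.trace(rho @ np.kron(B, A))                  # rho(B (x) A)
rhs = np.vdot(Psi, np.kron(B, T(A)) @ Psi)           # <Psi, (B (x) T(A)) Psi>
```

By linearity, agreement of `lhs` and `rhs` for arbitrary product operators is equivalent to $\rho = (\mathrm{Id} \otimes T^*)\sigma$ with $\sigma = |\Psi\rangle\langle\Psi|$.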

2.4 Separability criteria and positive maps 



We have already stated in Subsection 2.3.1 that positive but not completely positive
maps exist, whenever input and output algebra are quantum. No such map
represents a valid quantum operation, nevertheless they are of great importance in
quantum information theory, due to their deep relations to entanglement properties.
Hence, this Section is a continuation of the study of separability criteria which we
have started in 2.2.4. In contrast to the rest of this section, all maps are considered
in the Schrödinger rather than in the Heisenberg picture.

2.4.1. Positivity. — Let us consider now an arbitrary positive, but not necessarily
completely positive map $T^* : \mathcal{B}^*(\mathcal{K}) \to \mathcal{B}^*(\mathcal{H})$. If $\mathrm{Id}$ again denotes the identity
map, it is easy to see that $(\mathrm{Id} \otimes T^*)(\sigma_1 \otimes \sigma_2) = \sigma_1 \otimes T^*(\sigma_2) \geq 0$ holds for each
product state $\sigma_1 \otimes \sigma_2 \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$. Hence $(\mathrm{Id} \otimes T^*)\rho \geq 0$ for each positive $T^*$ is
a necessary condition for $\rho$ to be separable. The following theorem proved in [86]
shows that sufficiency holds as well.

Theorem 2.4.1 A state $\rho \in \mathcal{B}^*(\mathcal{H} \otimes \mathcal{K})$ is separable iff for any positive map $T^* :
\mathcal{B}^*(\mathcal{K}) \to \mathcal{B}^*(\mathcal{H})$ the operator $(\mathrm{Id} \otimes T^*)\rho$ is positive.

Proof. We will only give a sketch of the proof; see [86] for details. The condition is
obviously necessary since $(\mathrm{Id} \otimes T^*)\rho_1 \otimes \rho_2 \geq 0$ holds for any product state provided
$T^*$ is positive. The proof of sufficiency relies on the fact that it is always possible
to separate a point $\rho$ (an entangled state) from a convex set $\mathcal{D}$ (the set of separable
states) by a hyperplane. A precise formulation of this idea leads to the following
proposition.

Proposition 2.4.2 For any entangled state p G §(Jf (g) 3C) there is an operator 
A on 5{ (g 3C called entanglement witness for p, with the property p{A) < and 
(^(A) > for all separable a e S(Jf g) K). 

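Proof. Since $\mathcal{D} \subset \mathcal{B}^*(\mathcal{H} \otimes \mathcal{K})$ is a closed convex set, for each $\rho \in \mathcal{S} \subset \mathcal{B}^*(\mathcal{H} \otimes \mathcal{K})$ with
$\rho \notin \mathcal{D}$ there exists a linear functional $\alpha$ on $\mathcal{B}^*(\mathcal{H} \otimes \mathcal{K})$ such that $\alpha(\rho) < \gamma \leq \alpha(\sigma)$
for each $\sigma \in \mathcal{D}$ with a constant $\gamma$. This holds as well in infinite dimensional Banach
spaces and is a consequence of the Hahn-Banach theorem (cf. [135] Theorem 3.4).
Without loss of generality we can assume that $\gamma = 0$ holds. Otherwise we just have
to replace $\alpha$ by $\alpha - \gamma\,\mathrm{tr}$. Hence the result follows from the fact that each linear
functional on $\mathcal{B}^*(\mathcal{H} \otimes \mathcal{K})$ has the form $\alpha(\sigma) = \mathrm{tr}(A\sigma)$ with $A \in \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$. □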



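To continue the proof of Theorem 2.4.1, associate now to any operator
$A \in \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ the map $T_A^*: \mathcal{B}^*(\mathcal{K}) \to \mathcal{B}^*(\mathcal{H})$ with

$$\mathrm{tr}(A\,\rho_1 \otimes \rho_2) = \mathrm{tr}\bigl(\rho_1^T\, T_A^*(\rho_2)\bigr), \tag{2.21}$$

where $(\,\cdot\,)^T$ denotes the transposition in an arbitrary but fixed orthonormal basis
$|j\rangle$, $j = 1, \ldots, d$. It is easy to see that $T_A^*$ is positive if $\mathrm{tr}(A\,\rho_1 \otimes \rho_2) \geq 0$ for all
product states $\rho_1 \otimes \rho_2 \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$ [86]. A straightforward calculation [86] shows in
addition that

$$\mathrm{tr}(A\rho) = \mathrm{tr}\bigl(|\Psi\rangle\langle\Psi|\,(\mathrm{Id} \otimes T_A^*)(\rho)\bigr) \tag{2.22}$$

holds, where $\Psi = d^{-1/2} \sum_j |j\rangle \otimes |j\rangle$. Assume now that $(\mathrm{Id} \otimes T^*)\rho \geq 0$ for all positive
$T^*$. Since $T_A^*$ is positive this implies that the right hand side of (2.22) is positive, hence
$\mathrm{tr}(A\rho) \geq 0$ provided $\mathrm{tr}(A\sigma) \geq 0$ holds for all separable $\sigma$, and the statement follows
from Proposition 2.4.2. □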
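
The notion of an entanglement witness from Proposition 2.4.2 can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the choice of the flip operator $F$ as witness and of the two-qubit singlet as entangled state is an assumption made for illustration (the flip is the operator used for Werner states in Subsection 3.1.2), and the helper names are hypothetical.

```python
import numpy as np

d = 2

# Flip operator on C^d (x) C^d: F |j>|k> = |k>|j>.
F = np.zeros((d * d, d * d))
for j in range(d):
    for k in range(d):
        F[k * d + j, j * d + k] = 1.0


def expectation(A, vec):
    """Return <vec, A vec> for a normalized vector vec."""
    return float(np.real(np.vdot(vec, A @ vec)))


# Entangled test state: the singlet (|01> - |10>)/sqrt(2).
singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
witness_on_singlet = expectation(F, singlet)

# On product vectors phi (x) psi the expectation equals |<phi, psi>|^2 >= 0.
rng = np.random.default_rng(0)
product_values = []
for _ in range(200):
    phi = rng.normal(size=d) + 1j * rng.normal(size=d)
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    phi /= np.linalg.norm(phi)
    psi /= np.linalg.norm(psi)
    product_values.append(expectation(F, np.kron(phi, psi)))
min_on_products = min(product_values)
```

Since every separable state is a convex combination of pure product states, non-negativity on product vectors is enough to make $F$ a witness for the singlet in the sense of the proposition.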



2.4.2. The partial transpose. — The most typical example for a positive non-cp
map is the transposition $\Theta A = A^T$ of $d \times d$ matrices, which we have just used in
the proof of Theorem 2.4.1. $\Theta$ is obviously a positive map, but the partial transpose

$$\mathcal{B}^*(\mathcal{H} \otimes \mathcal{K}) \ni \rho \mapsto (\mathrm{Id} \otimes \Theta)(\rho) \in \mathcal{B}^*(\mathcal{H} \otimes \mathcal{K}) \tag{2.23}$$

is not. The latter can be easily checked with the maximally entangled state (cf.
Subsection 3.1.1)

$$\Psi = \frac{1}{\sqrt{d}} \sum_j |j\rangle \otimes |j\rangle, \tag{2.24}$$

where $|j\rangle \in \mathbb{C}^d$, $j = 1, \ldots, d$ denote the canonical basis vectors. In low dimensions
the transposition is basically the only positive map which is not cp. Due to results
of Størmer [148] and Woronowicz [174] we have: $\dim\mathcal{H} = 2$ and $\dim\mathcal{K} = 2, 3$ imply
that each positive map $T^*: \mathcal{B}^*(\mathcal{H}) \to \mathcal{B}^*(\mathcal{K})$ has the form $T^* = T_1^* + T_2^*\Theta$ with
two cp maps $T_1^*, T_2^*$ and the transposition on $\mathcal{B}(\mathcal{H})$. This immediately implies that
positivity of the partial transpose is necessary and sufficient for separability of a
state $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$ (cf. @):

Theorem 2.4.3 Consider a bipartite system $\mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ with $\dim\mathcal{H} = 2$ and
$\dim\mathcal{K} = 2, 3$. A state $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$ is separable iff its partial transpose is positive.

To use positivity of the partial transpose as a separability criterion was proposed
for the first time by Peres [127], and he conjectured that it is a necessary and
sufficient condition in arbitrary finite dimension. Although it has turned out in the
meantime that this conjecture is wrong in general (cf. Subsection 3.1.5), partial
transposition has become a crucial tool within entanglement theory and we define:

Definition 2.4.4 A state $\rho \in \mathcal{B}^*(\mathcal{H} \otimes \mathcal{K})$ of a bipartite quantum system is called
ppt-state if $(\mathrm{Id} \otimes \Theta)\rho \geq 0$ holds and npt-state otherwise (ppt = "positive partial
transpose" and npt = "negative partial transpose").
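
The ppt property of Definition 2.4.4 is easy to test numerically. The sketch below assumes NumPy and the index convention that the basis vector $|j\rangle \otimes |k\rangle$ sits at position $j d_2 + k$; the function names `partial_transpose` and `is_ppt` are illustrative and not taken from the text. It checks that the maximally entangled state of Equation (2.24) is an npt-state, while a product state is ppt.

```python
import numpy as np


def partial_transpose(rho, d1, d2):
    """Transpose the second tensor factor of an operator on C^d1 (x) C^d2."""
    r = rho.reshape(d1, d2, d1, d2)
    # Swap the row and column index of the second factor only.
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)


def is_ppt(rho, d1, d2, tol=1e-12):
    """True if (Id (x) Theta) rho has no eigenvalue below -tol."""
    return np.linalg.eigvalsh(partial_transpose(rho, d1, d2)).min() >= -tol


d = 3
psi = np.eye(d).reshape(d * d) / np.sqrt(d)      # d^{-1/2} sum_j |j>|j>
rho_entangled = np.outer(psi, psi.conj())
rho_product = np.kron(np.diag([0.5, 0.3, 0.2]), np.eye(d) / d)

min_eig_entangled = np.linalg.eigvalsh(
    partial_transpose(rho_entangled, d, d)
).min()
```

For the maximally entangled state the partial transpose is the flip operator divided by $d$, so its smallest eigenvalue is $-1/d$.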

2.4.3. The reduction criterion. — Another frequently used example of a non-cp
but positive map is $\mathcal{B}^*(\mathcal{H}) \ni \rho \mapsto T^*(\rho) = (\mathrm{tr}\,\rho)\mathbb{1} - \rho \in \mathcal{B}^*(\mathcal{H})$. The eigenvalues
of $T^*(\rho)$ are given by $\mathrm{tr}\,\rho - \lambda_i$, where $\lambda_i$ are the eigenvalues of $\rho$. If $\rho \geq 0$ we
have $\lambda_i \geq 0$ and therefore $\sum_j \lambda_j - \lambda_k \geq 0$. Hence $T^*$ is positive. That $T^*$ is not
completely positive follows if we consider again the example from Equation
(2.24). Hence we get

$$\mathbb{1} \otimes \mathrm{tr}_2(\rho) - \rho \geq 0, \qquad \mathrm{tr}_1(\rho) \otimes \mathbb{1} - \rho \geq 0 \tag{2.25}$$

for any separable state $\rho \in \mathcal{B}^*(\mathcal{H} \otimes \mathcal{K})$. These equations are another non-trivial
separability criterion, which is called the reduction criterion E2|. It is closely
related to the ppt criterion, due to the following proposition (see [^5[) for a proof).

Proposition 2.4.5 Each ppt-state $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$ satisfies the reduction criterion.
If $\dim\mathcal{H} = 2$ and $\dim\mathcal{K} = 2, 3$ both criteria are equivalent.



Hence we see with Theorem 2.4.3 that a state $\rho$ in $2 \times 2$ or $2 \times 3$ dimensions is
separable iff it satisfies the reduction criterion.
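
The reduction criterion can be checked in the same numerical style. The following sketch assumes NumPy and uses the standard convention that a reduced density matrix is tensored with the identity on the traced-out factor, which may differ in labelling from the $\mathrm{tr}_1$, $\mathrm{tr}_2$ notation of Equation (2.25); the function names are hypothetical.

```python
import numpy as np


def reduced_states(rho, d1, d2):
    """Return the reduced density matrices on the first and second factor."""
    r = rho.reshape(d1, d2, d1, d2)
    rho_a = np.einsum('ikjk->ij', r)   # trace over the second factor
    rho_b = np.einsum('kikj->ij', r)   # trace over the first factor
    return rho_a, rho_b


def satisfies_reduction(rho, d1, d2, tol=1e-12):
    """Check rho_a (x) 1 - rho >= 0 and 1 (x) rho_b - rho >= 0."""
    rho_a, rho_b = reduced_states(rho, d1, d2)
    op1 = np.kron(rho_a, np.eye(d2)) - rho
    op2 = np.kron(np.eye(d1), rho_b) - rho
    smallest = min(np.linalg.eigvalsh(op1).min(), np.linalg.eigvalsh(op2).min())
    return smallest >= -tol


d = 2
psi = np.eye(d).reshape(d * d) / np.sqrt(d)      # maximally entangled vector
rho_entangled = np.outer(psi, psi)
rho_product = np.kron(np.diag([0.7, 0.3]), np.diag([0.4, 0.6]))

entangled_ok = satisfies_reduction(rho_entangled, d, d)
product_ok = satisfies_reduction(rho_product, d, d)
```

The maximally entangled state violates the criterion because its reductions are $\mathbb{1}/d$, so the operator in question has the eigenvalue $1/d - 1 < 0$.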



Chapter 3 
Basic examples 



After the somewhat abstract discussion in the last chapter we will become more 
concrete now. In the following we will present a number of examples which help 
on the one hand to understand the structures just introduced, and which are of 
fundamental importance within quantum information on the other. 

3.1 Entanglement 

Although our definition of entanglement (Definition 2.2.5) is applicable in arbitrary
dimensions, detailed knowledge about entangled states is available only for low
dimensional systems or for states with very special properties. In this section we
will discuss some of the most basic examples.

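3.1.1. Maximally entangled states. — Let us start with a look at pure states
of a composite system $\mathcal{A} \otimes \mathcal{B}$ and their possible correlations. If one subsystem is
classical, i.e. $\mathcal{A} = \mathcal{C}(\{1, \ldots, d\})$, the state space is given according to Subsection
2.2.2 by $\mathcal{S}(\mathcal{B})^d$, and $\rho \in \mathcal{S}(\mathcal{B})^d$ is pure iff $\rho = (\delta_{j1}\tau, \ldots, \delta_{jd}\tau)$ with $j = 1, \ldots, d$
and a pure state $\tau$ of the $\mathcal{B}$ system. Hence the restrictions of $\rho$ to $\mathcal{A}$ respectively $\mathcal{B}$
are the Dirac measure $\delta_j \in \mathcal{S}(\mathcal{A})$ or $\tau \in \mathcal{S}(\mathcal{B})$; in other words both restrictions are
pure. This is completely different if $\mathcal{A}$ and $\mathcal{B}$ are quantum, i.e. $\mathcal{A} \otimes \mathcal{B} = \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$:
Consider $\rho = |\Psi\rangle\langle\Psi|$ with $\Psi \in \mathcal{H} \otimes \mathcal{K}$ and Schmidt decomposition (Proposition
2.2.1) $\Psi = \sum_j \lambda_j^{1/2} \phi_j \otimes \psi_j$. Calculating the $\mathcal{A}$ restriction, i.e. the partial trace over
$\mathcal{K}$, we get

$$\mathrm{tr}[\mathrm{tr}_\mathcal{K}(\rho)A] = \mathrm{tr}[|\Psi\rangle\langle\Psi|\,A \otimes \mathbb{1}] = \sum_{jk} \lambda_j^{1/2}\lambda_k^{1/2} \langle\phi_j, A\phi_k\rangle \delta_{jk}, \tag{3.1}$$

hence $\mathrm{tr}_\mathcal{K}(\rho) = \sum_j \lambda_j |\phi_j\rangle\langle\phi_j|$ is mixed iff $\Psi$ is entangled. The most extreme case
arises if $\mathcal{H} = \mathcal{K} = \mathbb{C}^d$ and $\mathrm{tr}_\mathcal{K}(\rho)$ is maximally mixed, i.e. $\mathrm{tr}_\mathcal{K}(\rho) = \mathbb{1}/d$. In this case
$\Psi$ has the form

$$\Psi = \frac{1}{\sqrt{d}} \sum_{j=1}^{d} \phi_j \otimes \psi_j, \tag{3.2}$$

with two orthonormal bases $\phi_1, \ldots, \phi_d$ and $\psi_1, \ldots, \psi_d$. In $2n \times 2n$ dimensions these
states violate maximally the CHSH inequalities, with appropriately chosen operators
$A, A', B, B'$. Such states are therefore called maximally entangled. The most
prominent examples of maximally entangled states are the four "Bell states" for
two qubit systems, i.e. $\mathcal{H} = \mathcal{K} = \mathbb{C}^2$, $|1\rangle, |0\rangle$ denotes the canonical basis and

$$\Phi_0 = \frac{1}{\sqrt{2}}\bigl(|11\rangle + |00\rangle\bigr), \qquad \Phi_j = i(\mathbb{1} \otimes \sigma_j)\Phi_0, \quad j = 1, 2, 3, \tag{3.3}$$

where we have used the shorthand notation $|jk\rangle$ for $|j\rangle \otimes |k\rangle$ and the $\sigma_j$ denote the
Pauli matrices.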

The Bell states, which form an orthonormal basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$, are the best studied
and most relevant examples of entangled states within quantum information. A
mixture of them, i.e. a density matrix $\rho \in \mathcal{S}(\mathbb{C}^2 \otimes \mathbb{C}^2)$ with eigenvectors $\Phi_j$ and
eigenvalues $0 \leq \lambda_j \leq 1$, $\sum_j \lambda_j = 1$, is called a Bell diagonal state. It can be shown
p6[ that $\rho$ is entangled iff $\max_j \lambda_j > 1/2$ holds. We omit the proof of this statement
here, but we will come back to this point in Chapter ^ within the discussion of
entanglement measures.

Let us come back to the general case now and consider an arbitrary $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$.
Using maximally entangled states, we can introduce another separability criterion
in terms of the maximally entangled fraction (cf. p^)

$$\mathcal{F}(\rho) = \sup_{\Psi\ \text{max. ent.}} \langle\Psi, \rho\Psi\rangle. \tag{3.4}$$

If $\rho$ is separable the reduction criterion (2.25) implies $\langle\Psi, [\mathrm{tr}_1(\rho) \otimes \mathbb{1} - \rho]\Psi\rangle \geq 0$ for
any maximally entangled state. Since the partial trace of $|\Psi\rangle\langle\Psi|$ is $d^{-1}\mathbb{1}$ we get

$$d^{-1} = \langle\Psi, \mathrm{tr}_1(\rho) \otimes \mathbb{1}\,\Psi\rangle \geq \langle\Psi, \rho\Psi\rangle, \tag{3.5}$$

hence $\mathcal{F}(\rho) \leq 1/d$. This condition is not very sharp however. Using the ppt criterion
it can be shown that $\rho = \lambda|\Phi_1\rangle\langle\Phi_1| + (1 - \lambda)|00\rangle\langle 00|$ (with the Bell state $\Phi_1$) is
entangled for all $0 < \lambda < 1$, but a straightforward calculation shows that $\mathcal{F}(\rho) \leq 1/2$
holds for $\lambda \leq 1/2$.
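
Both statements about two-qubit states made above can be probed numerically, since for two qubits the ppt criterion decides separability (Theorem 2.4.3). The sketch below assumes NumPy; it builds the Bell states of Equation (3.3), forms Bell diagonal states, and evaluates the smallest eigenvalue of the partial transpose. The helper names are illustrative only.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
phi0 = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
bell = [phi0] + [1j * np.kron(np.eye(2), s) @ phi0 for s in sigma]

# Gram matrix of the four Bell vectors; it should be the identity.
B = np.column_stack(bell)
gram = B.conj().T @ B


def bell_diagonal(lams):
    """Mixture sum_j lams[j] |Phi_j><Phi_j|."""
    return sum(l * np.outer(v, v.conj()) for l, v in zip(lams, bell))


def min_pt_eigenvalue(rho):
    """Smallest eigenvalue of the partial transpose of a two-qubit operator."""
    r = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return np.linalg.eigvalsh(r).min()


def mixture(lam):
    """lam |Phi_1><Phi_1| + (1 - lam) |00><00|."""
    e00 = np.array([1, 0, 0, 0], dtype=complex)
    return lam * np.outer(bell[1], bell[1].conj()) + (1 - lam) * np.outer(e00, e00)


entangled_bd = min_pt_eigenvalue(bell_diagonal([0.6, 0.2, 0.1, 0.1]))
boundary_bd = min_pt_eigenvalue(bell_diagonal([0.5, 0.3, 0.1, 0.1]))
small_lambda_mix = min_pt_eigenvalue(mixture(0.1))
```

For a Bell diagonal state the partial transpose has the eigenvalues $1/2 - \lambda_j$, so the sign of the smallest one reproduces the condition $\max_j \lambda_j > 1/2$, while the second family has a negative partial-transpose eigenvalue even for small $\lambda$.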

Finally, we have to mention here a very useful parameterization of the set of
pure states on $\mathcal{H} \otimes \mathcal{H}$ in terms of maximally entangled states: If $\Psi$ is an arbitrary
but fixed maximally entangled state, each $\phi \in \mathcal{H} \otimes \mathcal{H}$ admits (uniquely determined)
operators $X_1, X_2$ such that

$$\phi = (X_1 \otimes \mathbb{1})\Psi = (\mathbb{1} \otimes X_2)\Psi \tag{3.6}$$

holds. This can be easily checked in a product basis.
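
The check in a product basis can be written out explicitly. The following sketch assumes NumPy and takes $\Psi = d^{-1/2}\sum_j |jj\rangle$ as the fixed maximally entangled state (an assumption; any other choice differs by a local unitary). Writing $\phi = \sum_{jk} C_{jk}|jk\rangle$, the operators are then $X_1 = \sqrt{d}\,C$ and $X_2 = \sqrt{d}\,C^T$.

```python
import numpy as np

d = 3
rng = np.random.default_rng(0)

# An arbitrary normalized vector phi in C^d (x) C^d.
phi = rng.normal(size=d * d) + 1j * rng.normal(size=d * d)
phi /= np.linalg.norm(phi)

# Fixed maximally entangled vector Psi = d^{-1/2} sum_j |j>|j>.
psi = np.eye(d).reshape(d * d) / np.sqrt(d)

# Coefficient matrix of phi in the product basis: phi = sum_{jk} C[j, k] |j>|k>.
C = phi.reshape(d, d)
X1 = np.sqrt(d) * C
X2 = np.sqrt(d) * C.T

via_first_factor = np.kron(X1, np.eye(d)) @ psi
via_second_factor = np.kron(np.eye(d), X2) @ psi
```

Both products reproduce `phi`, and the formulas show directly why $X_1$ and $X_2$ are uniquely determined by the coefficients of $\phi$.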

3.1.2. Werner states. — If we consider entanglement of mixed states rather
than pure ones, the analysis becomes quite difficult, even if the dimensions of the
underlying Hilbert spaces are low. The reason is that the state space $\mathcal{S}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ of a
two-partite system with $\dim\mathcal{H}_i = d_i$ is a geometric object in a $d_1^2 d_2^2 - 1$ dimensional
space. Hence even in the simplest non-trivial case (two qubits) the dimension of the
state space becomes very high (15 dimensions) and naive geometric intuition can
be misleading. Therefore it is often useful to look at special classes of model states,
which can be characterized by only a few parameters. A quite powerful tool is the
study of symmetry properties, i.e. to investigate the set of states which is invariant
under a group of local unitaries. A general discussion of this scheme can be found
in [159]. In this paper we will present only three of the most prominent examples.

Consider first a state $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$ (with $\mathcal{H} = \mathbb{C}^d$) which is invariant under the
group of all $U \otimes U$ with a unitary $U$ on $\mathcal{H}$; i.e. $[U \otimes U, \rho] = 0$ for all $U$. Such a $\rho$
is usually called a Werner state [166], [128] and its structure can be analyzed quite
easily using a well known result of group theory which goes back to Weyl [171] (see
also Theorem IX.11.5 of [142]), and which we will state in detail for later reference:

Theorem 3.1.1 Each operator $A$ on the $N$-fold tensor product $\mathcal{H}^{\otimes N}$ of the (finite
dimensional) Hilbert space $\mathcal{H}$ which commutes with all unitaries of the form $U^{\otimes N}$
is a linear combination of permutation operators, i.e. $A = \sum_\pi \lambda_\pi V_\pi$, where the sum
is taken over all permutations $\pi$ of $N$ elements, $\lambda_\pi \in \mathbb{C}$ and $V_\pi$ is defined by

$$V_\pi\, \phi_1 \otimes \cdots \otimes \phi_N = \phi_{\pi^{-1}(1)} \otimes \cdots \otimes \phi_{\pi^{-1}(N)}. \tag{3.7}$$

In our case ($N = 2$) there are only two permutations: the identity $\mathbb{1}$ and the flip
$F(\psi \otimes \phi) = \phi \otimes \psi$. Hence $\rho = a\mathbb{1} + bF$ with appropriate coefficients $a, b$. Since $\rho$ is
a density matrix, $a$ and $b$ are not independent. To get a transparent way to express
these constraints, it is reasonable to consider the eigenprojections $P_\pm$ of $F$ rather
than $\mathbb{1}$ and $F$; i.e. $FP_\pm\psi = \pm P_\pm\psi$ and $P_\pm = (\mathbb{1} \pm F)/2$. The $P_\pm$ are the projections
on the subspaces $\mathcal{H}^{\otimes 2}_\pm \subset \mathcal{H} \otimes \mathcal{H}$ of symmetric respectively antisymmetric tensor
products (Bose- respectively Fermi-subspace). If we write $d_\pm = d(d \pm 1)/2$ for the
dimensions of $\mathcal{H}^{\otimes 2}_\pm$ we get for each Werner state $\rho$

$$\rho = \frac{\lambda}{d_+} P_+ + \frac{1 - \lambda}{d_-} P_-, \qquad \lambda \in [0, 1]. \tag{3.8}$$






On the other hand it is obvious that each state of this form is $U \otimes U$ invariant,
hence a Werner state.

If $\rho$ is given, it is very easy to calculate the parameter $\lambda$ from the expectation
value of $\rho$ and the flip, $\mathrm{tr}(\rho F) = 2\lambda - 1 \in [-1, 1]$. Therefore we can write for an
arbitrary state $\sigma \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$

$$P_{UU}(\sigma) = \frac{\mathrm{tr}(\sigma F) + 1}{2}\,\frac{P_+}{d_+} + \frac{1 - \mathrm{tr}(\sigma F)}{2}\,\frac{P_-}{d_-}, \tag{3.9}$$

and this defines a projection from the full state space to the set of Werner states
which is called the twirl operation. In many cases it is quite useful that it can be
written alternatively as a group average of the form

$$P_{UU}(\sigma) = \int_{U(d)} (U \otimes U)\,\sigma\,(U^* \otimes U^*)\,dU, \tag{3.10}$$

where $dU$ denotes the normalized, left invariant Haar measure on $U(d)$. To check
this identity note first that its right hand side is indeed $U \otimes U$ invariant, due to the
invariance of the volume element $dU$. Hence we have to check only that the trace
of $F$ times the integral coincides with $\mathrm{tr}(F\sigma)$:

$$\mathrm{tr}\Bigl[F \int_{U(d)} (U \otimes U)\,\sigma\,(U^* \otimes U^*)\,dU\Bigr] = \int_{U(d)} \mathrm{tr}\bigl[F(U \otimes U)\,\sigma\,(U^* \otimes U^*)\bigr]\,dU \tag{3.11}$$

$$= \mathrm{tr}(F\sigma) \int_{U(d)} dU = \mathrm{tr}(F\sigma), \tag{3.12}$$

where we have used the fact that $F$ commutes with $U \otimes U$ and the normalization of
$dU$. We can apply $P_{UU}$ obviously to arbitrary operators $A \in \mathcal{B}(\mathcal{H} \otimes \mathcal{H})$ and, as an
integral over unitarily implemented operations, we get a channel. Substituting $U \to
U^*$ in (3.10) and cycling the trace $\mathrm{tr}(AP_{UU}(\sigma))$ we find $\mathrm{tr}(P_{UU}(A)\rho) = \mathrm{tr}(AP_{UU}(\rho))$,
hence $P_{UU}$ has the same form in the Heisenberg and the Schrödinger picture (i.e.
$P_{UU}^* = P_{UU}$).

If $\sigma \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$ is a separable state the integrand of $P_{UU}(\sigma)$ in Equation (3.10)
consists entirely of separable states, hence $P_{UU}(\sigma)$ is separable. Since each Werner
state $\rho$ is the twirl of itself, we see that $\rho$ is separable iff it is the twirl $P_{UU}(\sigma)$ of
a separable state $\sigma \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$. To determine the set of separable Werner states
we therefore have to calculate only the set of all $\mathrm{tr}(F\sigma) \in [-1, 1]$ with separable
$\sigma$. Since each such $\sigma$ admits a convex decomposition into pure product states it is
sufficient to look at

$$\langle\psi \otimes \phi, F\,\psi \otimes \phi\rangle = |\langle\psi, \phi\rangle|^2, \tag{3.13}$$

which ranges from 0 to 1. Hence $\rho$ from Equation (3.8) is separable iff $1/2 \leq \lambda \leq 1$
and entangled otherwise (due to $\lambda = (\mathrm{tr}(F\rho) + 1)/2$). If $\mathcal{H} = \mathbb{C}^2$ holds, each Werner
state is Bell diagonal and we recover the result from Subsection 3.1.1 (separable if
the highest eigenvalue is less than or equal to 1/2).
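
The structure of Werner states can be reproduced in a few lines. This is a minimal sketch, assuming NumPy; it implements the family (3.8) and the twirl in the closed form (3.9) rather than as a Haar integral, and it uses the partial transpose as a numerical indicator of entanglement (for Werner states its smallest eigenvalue works out to $\min(a, (2\lambda - 1)/d)$ with $a > 0$, so it is negative exactly when $\lambda < 1/2$). All function names are illustrative.

```python
import numpy as np


def flip(d):
    """Flip operator F |j>|k> = |k>|j> on C^d (x) C^d."""
    F = np.zeros((d * d, d * d))
    for j in range(d):
        for k in range(d):
            F[k * d + j, j * d + k] = 1.0
    return F


def werner(lam, d):
    """Werner state lam * P_+/d_+ + (1 - lam) * P_-/d_-."""
    F, I = flip(d), np.eye(d * d)
    d_plus, d_minus = d * (d + 1) / 2, d * (d - 1) / 2
    return lam * (I + F) / (2 * d_plus) + (1 - lam) * (I - F) / (2 * d_minus)


def twirl(sigma, d):
    """Closed form of the twirl: keep only the expectation value tr(sigma F)."""
    lam = (np.real(np.trace(sigma @ flip(d))) + 1) / 2
    return werner(lam, d)


def min_pt_eig(rho, d):
    """Smallest eigenvalue of the partial transpose on the second factor."""
    r = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return np.linalg.eigvalsh(r).min()


d = 3
rho = werner(0.3, d)
flip_expectation = np.trace(rho @ flip(d))          # should equal 2*0.3 - 1

# U (x) U invariance for one fixed unitary obtained from a QR decomposition.
rng = np.random.default_rng(1)
Z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
U, _ = np.linalg.qr(Z)
UU = np.kron(U, U)
is_invariant = np.allclose(UU @ rho @ UU.conj().T, rho)
```

The twirl leaves every Werner state fixed, which can be confirmed by comparing `twirl(rho, d)` with `rho`.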

3.1.3. Isotropic states. — To derive a second class of states consider the partial
transpose $(\mathrm{Id} \otimes \Theta)\rho$ (with respect to a distinguished base $|j\rangle \in \mathcal{H}$, $j = 1, \ldots, d$) of
a Werner state $\rho$. Since $\rho$ is, by definition, $U \otimes U$ invariant, it is easy to see that
$(\mathrm{Id} \otimes \Theta)\rho$ is $U \otimes \bar{U}$ invariant, where $\bar{U}$ denotes component wise complex conjugation
in the base $|j\rangle$ (we just have to use that $U^* = \bar{U}^T$ holds). Each state $\tau$ with this kind
of symmetry is called an isotropic state [132], and our previous discussion shows
that $\tau$ is a linear combination of $\mathbb{1}$ and the partial transpose of the flip, which is
the rank one operator

$$\tilde{F} = (\mathrm{Id} \otimes \Theta)F = |\Psi\rangle\langle\Psi| = \sum_{j,k=1}^{d} |jj\rangle\langle kk|, \tag{3.14}$$

where $\Psi = \sum_j |jj\rangle$ is, up to normalization, a maximally entangled state. Hence each
isotropic $\tau$ can be written as

$$\tau = \frac{1}{d}\Bigl(\lambda\,\frac{\mathbb{1}}{d} + (1 - \lambda)\tilde{F}\Bigr), \qquad \lambda \in \Bigl[0, \frac{d^2}{d^2 - 1}\Bigr], \tag{3.15}$$

where the bounds on $\lambda$ follow from normalization and positivity. As above we can
determine the parameter $\lambda$ from the expectation value

$$\mathrm{tr}(\tilde{F}\tau) = \frac{1 - d^2}{d}\,\lambda + d, \tag{3.16}$$

which ranges from 0 to $d$, and this again leads to a twirl operation: For an arbitrary
state $\sigma \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$ we can define

$$P_{U\bar{U}}(\sigma) = \frac{1}{d(1 - d^2)}\Bigl(\bigl[\mathrm{tr}(\tilde{F}\sigma) - d\bigr]\mathbb{1} + \bigl[1 - d\,\mathrm{tr}(\tilde{F}\sigma)\bigr]\tilde{F}\Bigr), \tag{3.17}$$

and as for Werner states $P_{U\bar{U}}$ can be rewritten in terms of a group average

$$P_{U\bar{U}}(\sigma) = \int_{U(d)} (U \otimes \bar{U})\,\sigma\,(U^* \otimes \bar{U}^*)\,dU. \tag{3.18}$$

Now we can proceed in the same way as above: $P_{U\bar{U}}$ is a channel with $P_{U\bar{U}}^* = P_{U\bar{U}}$,
its fixed points $P_{U\bar{U}}(\tau) = \tau$ are exactly the isotropic states, and the image of the set
of separable states under $P_{U\bar{U}}$ coincides with the set of separable isotropic states.
To determine the latter we have to consider the expectation values (cf. Equation
(3.13))

$$\langle\psi \otimes \phi, \tilde{F}\,\psi \otimes \phi\rangle = \Bigl|\sum_{j=1}^{d} \psi_j\phi_j\Bigr|^2 = |\langle\psi, \bar{\phi}\rangle|^2 \in [0, 1]. \tag{3.19}$$

This implies that $\tau$ is separable iff

$$\frac{d(d - 1)}{d^2 - 1} \leq \lambda \leq \frac{d^2}{d^2 - 1} \tag{3.20}$$

holds and entangled otherwise. For $\lambda = 0$ we recover the maximally entangled state.
For $d = 2$ we recover again the special case of Bell diagonal states encountered
already in the last subsection.
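
The parameter ranges for isotropic states can be cross-checked numerically. The sketch below assumes NumPy and the parameterization $\tau = d^{-1}(\lambda\mathbb{1}/d + (1 - \lambda)\tilde{F})$ with the unnormalized $\tilde{F} = \sum_{jk}|jj\rangle\langle kk|$; the function names are hypothetical. It uses the partial transpose as an entanglement indicator, whose sign change reproduces the lower bound in (3.20).

```python
import numpy as np


def f_tilde(d):
    """Rank one operator sum_{jk} |jj><kk| (partial transpose of the flip)."""
    psi = np.eye(d).reshape(d * d)       # unnormalized sum_j |jj>
    return np.outer(psi, psi)


def isotropic(lam, d):
    """Isotropic state (lam * 1/d + (1 - lam) * F~) / d."""
    return (lam * np.eye(d * d) / d + (1 - lam) * f_tilde(d)) / d


def min_pt_eig(rho, d):
    """Smallest eigenvalue of the partial transpose on the second factor."""
    r = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return np.linalg.eigvalsh(r).min()


d = 3
lam = 0.5
tau = isotropic(lam, d)

expectation = np.trace(f_tilde(d) @ tau)
predicted = (1 - d**2) / d * lam + d          # right hand side of (3.16)

lam_upper = d**2 / (d**2 - 1)                 # positivity bound on lam
lam_sep = d * (d - 1) / (d**2 - 1)            # lower separability bound
min_eig_at_upper = np.linalg.eigvalsh(isotropic(lam_upper, d)).min()
pt_at_sep = min_pt_eig(isotropic(lam_sep, d), d)
pt_below_sep = min_pt_eig(tau, d)
```

At `lam_upper` the state itself acquires a zero eigenvalue, and at `lam_sep` the partial transpose does, which matches the two ends of the separable interval.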

3.1.4. OO-invariant states. — Let us combine now Werner states with isotropic
states, i.e. we look for density matrices $\rho$ which can be written as $\rho = a\mathbb{1} + bF + c\tilde{F}$,
or, if we introduce the three mutually orthogonal projection operators

$$p_0 = \frac{1}{d}\tilde{F}, \qquad p_1 = \frac{1}{2}(\mathbb{1} - F), \qquad p_2 = \frac{1}{2}(\mathbb{1} + F) - \frac{1}{d}\tilde{F}, \tag{3.21}$$

as a convex linear combination of $\mathrm{tr}(p_j)^{-1}p_j$, $j = 0, 1, 2$:

$$\rho = (1 - \lambda_1 - \lambda_2)\,p_0 + \lambda_1\,\frac{p_1}{\mathrm{tr}(p_1)} + \lambda_2\,\frac{p_2}{\mathrm{tr}(p_2)}, \qquad \lambda_1, \lambda_2 \geq 0, \quad \lambda_1 + \lambda_2 \leq 1. \tag{3.22}$$

Each such operator is invariant under all transformations of the form $U \otimes U$ if $U$
is a unitary with $\bar{U} = U$; in other words, $U$ should be a real orthogonal matrix.
A little bit of representation theory of the orthogonal group shows that in fact all
operators with this invariance property have the form given in (3.22); cf. [159]. The
corresponding states are therefore called OO-invariant, and we can apply basically
the same machinery as in Subsection 3.1.2 if we replace the unitary group $U(d)$
by the orthogonal group $O(d)$. This includes in particular the definition of a twirl
operation as an average over $O(d)$ (for an arbitrary $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$):

$$P_{OO}(\rho) = \int_{O(d)} (U \otimes U)\,\rho\,(U \otimes U)^*\,dU, \tag{3.23}$$

which we can express alternatively in terms of the expectation values $\mathrm{tr}(F\rho)$, $\mathrm{tr}(\tilde{F}\rho)$
by

$$P_{OO}(\rho) = \frac{\mathrm{tr}(\tilde{F}\rho)}{d}\,p_0 + \frac{1 - \mathrm{tr}(F\rho)}{2\,\mathrm{tr}(p_1)}\,p_1 + \Bigl(\frac{1 + \mathrm{tr}(F\rho)}{2} - \frac{\mathrm{tr}(\tilde{F}\rho)}{d}\Bigr)\frac{p_2}{\mathrm{tr}(p_2)}. \tag{3.24}$$

The range of allowed values for $\mathrm{tr}(F\rho)$, $\mathrm{tr}(\tilde{F}\rho)$ is given by

$$-1 \leq \mathrm{tr}(F\rho) \leq 1, \qquad 0 \leq \mathrm{tr}(\tilde{F}\rho) \leq d, \qquad \mathrm{tr}(F\rho) \geq \frac{2\,\mathrm{tr}(\tilde{F}\rho)}{d} - 1. \tag{3.25}$$

For $d = 3$ this is the upper triangle in Figure 3.1.

Figure 3.1: State space of OO-invariant states (upper triangle) and its partial transpose
(lower triangle) for d = 3. The special cases of isotropic and Werner states are
drawn as thin lines.

The values in the lower (dotted) triangle belong to partial transpositions of
OO-invariant states. The intersection of both, i.e. the gray shaded square $Q =
[0, 1] \times [0, 1]$, represents therefore the set of OO-invariant ppt states, and at the same
time the set of separable states, since each OO-invariant ppt state is separable. To
see the latter note that separable OO-invariant states form a convex subset of $Q$.
Hence, we only have to show that the corners of $Q$ are separable. To do this note
that 1. $P_{OO}(\rho)$ is separable whenever $\rho$ is and 2. that $\mathrm{tr}(FP_{OO}(\rho)) = \mathrm{tr}(F\rho)$ and
$\mathrm{tr}(\tilde{F}P_{OO}(\rho)) = \mathrm{tr}(\tilde{F}\rho)$ holds (cf. Equation (3.12)). We can consider pure product
states $|\phi \otimes \psi\rangle\langle\phi \otimes \psi|$ for $\rho$ and get $(|\langle\phi, \psi\rangle|^2, |\langle\phi, \bar{\psi}\rangle|^2)$ for the tuple $(\mathrm{tr}(F\rho), \mathrm{tr}(\tilde{F}\rho))$.
Now the point $(1, 1)$ in $Q$ is obtained if $\psi = \phi$ is real, the point $(0, 0)$ is obtained
for real and orthogonal $\phi, \psi$, and the point $(1, 0)$ belongs to the case $\psi = \phi$ and
$\langle\phi, \bar{\phi}\rangle = 0$. Symmetrically we get $(0, 1)$ with the same $\phi$ and $\psi = \bar{\phi}$.
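
The four corners of the square $Q$ can be reached with explicit product vectors. The following sketch assumes NumPy; the concrete vectors (two canonical basis vectors and $c = (1, i, 0)/\sqrt{2}$, which satisfies $\langle c, \bar{c}\rangle = 0$) are choices made here for illustration, and the function names are hypothetical.

```python
import numpy as np

d = 3

# Flip F and the rank one operator F~ = sum_{jk} |jj><kk|.
F = np.zeros((d * d, d * d))
for j in range(d):
    for k in range(d):
        F[k * d + j, j * d + k] = 1.0
psi_sum = np.eye(d).reshape(d * d)
F_tilde = np.outer(psi_sum, psi_sum)


def oo_coordinates(phi, psi):
    """Return (tr(F rho), tr(F~ rho)) for rho = |phi (x) psi><phi (x) psi|."""
    v = np.kron(phi, psi)
    t_f = np.real(np.vdot(v, F @ v))
    t_ft = np.real(np.vdot(v, F_tilde @ v))
    return np.array([t_f, t_ft])


e0 = np.array([1, 0, 0], dtype=complex)
e1 = np.array([0, 1, 0], dtype=complex)
c = np.array([1, 1j, 0], dtype=complex) / np.sqrt(2)   # <c, conj(c)> = 0

corner_11 = oo_coordinates(e0, e0)          # psi = phi real
corner_00 = oo_coordinates(e0, e1)          # real and orthogonal
corner_10 = oo_coordinates(c, c)            # psi = phi with <phi, conj(phi)> = 0
corner_01 = oo_coordinates(c, c.conj())     # psi = conj(phi)
```

Because the twirl $P_{OO}$ preserves both expectation values and maps separable states to separable states, these four product vectors certify that every corner of $Q$ belongs to a separable OO-invariant state.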



3.1.5. PPT states. — We have seen in Theorem 2.4.3 that separable states and
ppt states coincide in $2 \times 2$ and $2 \times 3$ dimensions. Another class of examples with
this property are the OO-invariant states just studied. Nevertheless, separability and a
positive partial transpose are not equivalent. An easy way to produce examples
of states which are entangled and ppt is given in terms of unextendible product bases
[11]. An orthonormal family $\phi_j \in \mathcal{H}_1 \otimes \mathcal{H}_2$, $j = 1, \ldots, N < d_1 d_2$ (with $d_k = \dim\mathcal{H}_k$)
is called an unextendible product basis^8 (UPB) iff 1. all $\phi_j$ are product vectors and
2. there is no product vector orthogonal to all $\phi_j$. Let us denote the projector to
the span of all $\phi_j$ by $E$, its orthocomplement by $E^\perp$, i.e. $E^\perp = \mathbb{1} - E$, and define
the state $\rho = (d_1 d_2 - N)^{-1}E^\perp$. It is entangled because there is by construction no
product vector in the support of $\rho$, and it is ppt. The latter can be seen as follows:
The projector $E$ is a sum of the one dimensional projectors $|\phi_j\rangle\langle\phi_j|$, $j = 1, \ldots, N$.
Since all $\phi_j$ are product vectors the partial transposes of the $|\phi_j\rangle\langle\phi_j|$ are of the form
$|\tilde{\phi}_j\rangle\langle\tilde{\phi}_j|$ with another UPB $\tilde{\phi}_j$, $j = 1, \ldots, N$, and the partial transpose $(\mathbb{1} \otimes \Theta)E$ of
$E$ is the sum of the $|\tilde{\phi}_j\rangle\langle\tilde{\phi}_j|$. Hence $(\mathbb{1} \otimes \Theta)E^\perp = \mathbb{1} - (\mathbb{1} \otimes \Theta)E$ is a projector and
therefore positive.

To construct entangled ppt states we have to find UPBs. The following two
examples are taken from [11]. Consider first the five vectors

$$\phi_j = N\bigl(\cos(2\pi j/5), \sin(2\pi j/5), h\bigr), \qquad j = 0, \ldots, 4, \tag{3.26}$$

with $N = 2/\sqrt{5 + \sqrt{5}}$ and $h = \frac{1}{2}\sqrt{1 + \sqrt{5}}$. They form the apex of a regular pentagonal
pyramid with height $h$. The latter is chosen such that nonadjacent vectors
are orthogonal. It is now easy to show that the five vectors

$$\Psi_j = \phi_j \otimes \phi_{2j \bmod 5}, \qquad j = 0, \ldots, 4 \tag{3.27}$$

form a UPB in the Hilbert space $\mathcal{H} \otimes \mathcal{H}$, $\dim\mathcal{H} = 3$ (cf. [11]). A second example,
again in $3 \times 3$ dimensional Hilbert space, are the following five vectors (called "Tiles"
in [11]):

$$\frac{1}{\sqrt{2}}|0\rangle \otimes (|0\rangle - |1\rangle), \quad \frac{1}{\sqrt{2}}|2\rangle \otimes (|1\rangle - |2\rangle), \quad \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle) \otimes |2\rangle,$$
$$\frac{1}{\sqrt{2}}(|1\rangle - |2\rangle) \otimes |0\rangle, \quad \frac{1}{3}(|0\rangle + |1\rangle + |2\rangle) \otimes (|0\rangle + |1\rangle + |2\rangle), \tag{3.28}$$

where $|k\rangle$, $k = 0, 1, 2$ denotes the standard basis in $\mathcal{H} = \mathbb{C}^3$.
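
The two families above can be set up numerically to confirm the parts of the construction that are easy to verify: orthonormality of the product vectors and positivity of the partial transpose of the complementary state. The sketch below assumes NumPy and uses hypothetical helper names; it does not test unextendibility itself, which would require a search over all product vectors.

```python
import numpy as np


def upb_state(vectors):
    """State (dim - N)^{-1} (1 - E), where E projects onto span(vectors)."""
    dim = vectors[0].size
    E = sum(np.outer(w, w.conj()) for w in vectors)
    return (np.eye(dim) - E) / (dim - len(vectors))


def min_pt_eig(rho, d):
    """Smallest eigenvalue of the partial transpose on the second factor."""
    r = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return np.linalg.eigvalsh(r).min()


# "Tiles" vectors of Equation (3.28).
e = np.eye(3)
s2 = np.sqrt(2)
tiles = [
    np.kron(e[0], e[0] - e[1]) / s2,
    np.kron(e[2], e[1] - e[2]) / s2,
    np.kron(e[0] - e[1], e[2]) / s2,
    np.kron(e[1] - e[2], e[0]) / s2,
    np.kron(e.sum(axis=0), e.sum(axis=0)) / 3,
]

# Pyramid vectors of Equations (3.26) and (3.27).
s5 = np.sqrt(5)
h = 0.5 * np.sqrt(1 + s5)
norm = 2 / np.sqrt(5 + s5)
v = [norm * np.array([np.cos(2 * np.pi * j / 5), np.sin(2 * np.pi * j / 5), h])
     for j in range(5)]
pyramid = [np.kron(v[j], v[(2 * j) % 5]) for j in range(5)]

gram_tiles = np.array(tiles) @ np.array(tiles).T
gram_pyramid = np.array(pyramid) @ np.array(pyramid).T

rho_tiles = upb_state(tiles)
rho_pyramid = upb_state(pyramid)
```

Both Gram matrices come out as the identity, and both complementary states have unit trace, rank four and a positive partial transpose, in line with the general argument given in the text.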

3.1.6. Multipartite states. — In many applications of quantum information
rather big systems, consisting of a large number of subsystems, occur (e.g. a quantum
register of a quantum computer) and it is necessary to study the corresponding
correlation and entanglement properties. Since this is a fairly difficult task, not
much is known about it, much less than in the two-partite case, which we mainly
consider in this paper. Nevertheless, in this subsection we will give a rough outline
of some of the most relevant aspects.

^8 This name is somewhat misleading because the $\phi_j$ are not a base of $\mathcal{H}_1 \otimes \mathcal{H}_2$.

At the level of pure states the most significant difficulty is the lack of an analog
of the Schmidt decomposition [126]. More precisely there are elements in an $N$-fold
tensor product $\mathcal{H}^{(1)} \otimes \cdots \otimes \mathcal{H}^{(N)}$ (with $N > 2$) which can not be written as^9

$$\Psi = \sum_{j=1}^{d} \lambda_j\, \phi_j^{(1)} \otimes \cdots \otimes \phi_j^{(N)} \tag{3.29}$$

with $N$ orthonormal bases $\phi_1^{(k)}, \ldots, \phi_d^{(k)}$ of $\mathcal{H}^{(k)}$, $k = 1, \ldots, N$. To get examples for
such states in the tri-partite case, note first that any partial trace of $|\Psi\rangle\langle\Psi|$ with $\Psi$
from Equation (3.29) has separable eigenvectors. Hence, each purification (Corollary
2.2.2) of an entangled, two-partite, mixed state with inseparable eigenvectors (e.g.
a Bell diagonal state) does not admit a Schmidt decomposition. This implies on
the one hand that there are interesting new properties to be discovered, but on
the other we see that many techniques developed for bipartite pure states can be
generalized in a straightforward way only for states which are Schmidt decomposable
in the sense of Equation (3.29). The most well known representative of this class
for a tripartite qubit system is the GHZ state [73]

$$\Psi = \frac{1}{\sqrt{2}}\bigl(|000\rangle + |111\rangle\bigr), \tag{3.30}$$

which has the special property that contradictions between local hidden variable
theories and quantum mechanics occur even for non-statistical predictions (as opposed
to maximally entangled states of bipartite systems; |7^, [117], |llq| ).

A second new aspect arising in the discussion of multiparty entanglement is the
fact that several different notions of separability occur. A state $\rho$ of an $N$-partite
system $\mathcal{B}(\mathcal{H}_1) \otimes \cdots \otimes \mathcal{B}(\mathcal{H}_N)$ is called $N$-separable if

$$\rho = \sum_J \lambda_J\, \rho_{j_1} \otimes \cdots \otimes \rho_{j_N}, \tag{3.31}$$

with states $\rho_{j_k} \in \mathcal{B}^*(\mathcal{H}_k)$ and multi indices $J = (j_1, \ldots, j_N)$. Alternatively, however,
we can decompose $\mathcal{B}(\mathcal{H}_1) \otimes \cdots \otimes \mathcal{B}(\mathcal{H}_N)$ in two subsystems (or even into $M$
subsystems if $M < N$) and call $\rho$ biseparable if it is separable with respect to this
decomposition. It is obvious that $N$-separability implies biseparability with respect
to all possible decompositions. The converse is, not very surprisingly, not true.
One way to construct a corresponding counterexample is to use an unextendible
product base (cf. Subsection 3.1.5). In [11] it is shown that the tripartite qubit state
complementary to the UPB

$$|0, 1, +\rangle, \quad |1, +, 0\rangle, \quad |+, 0, 1\rangle, \quad |-, -, -\rangle \qquad \text{with } |\pm\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle \pm |1\rangle\bigr) \tag{3.32}$$

is entangled (i.e. tri-inseparable) but biseparable with respect to any decomposition
into two subsystems (cf. [11] for details).

Another, maybe more systematic, way to find examples for multipartite states
with interesting properties is the generalization of the methods used for Werner
states (Subsection 3.1.2), i.e. to look for density matrices $\rho \in \mathcal{B}^*(\mathcal{H}^{\otimes N})$ which
commute with all unitaries of the form $U^{\otimes N}$. Applying again Theorem 3.1.1 we
see that each such $\rho$ is a linear combination of permutation unitaries. Hence the
structure of the set of all $U^{\otimes N}$ invariant states can be derived from representation
theory of the symmetric group (which can be tedious for large $N$!). For $N = 3$
this program is carried out in [61] and it turns out that the corresponding set of
invariant states is a five dimensional (real) manifold. We skip the details here and
refer to [61] instead.



^9 There is however the possibility to choose the bases $\phi_1^{(k)}, \ldots, \phi_d^{(k)}$ such that the number of
summands becomes minimal. For tri-partite systems this "minimal canonical form" is studied in

3.2 Channels



In Section 2.3 we have introduced channels as very general objects transforming
arbitrary types of information (i.e. classical, quantum and mixtures of them) into
one another. In the following we will consider some of the most important special
cases.

3.2.1. Quantum channels. — Many tasks of quantum information theory require
the transmission of quantum information over long distances, using devices
like optical fibers, or storing quantum information in some sort of memory. Both
situations can be described by a channel or quantum operation $T: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$,
where $T^*(\rho)$ is the quantum information which will be received when $\rho$ was sent,
or alternatively: which will be read off the quantum memory when $\rho$ was written.
Ideally we would prefer those channels which do not affect the information at all,
i.e. $T = \mathrm{Id}$, or, as the next best choice, a $T$ whose action can be undone by a physical
device, i.e. $T$ should be invertible and $T^{-1}$ is again a channel. The Stinespring
Theorem (Theorem 2.3.2) immediately shows that this implies $T^*\rho = U\rho U^*$ with
a unitary $U$; in other words the systems carrying the information do not interact
with the environment. We will call such a kind of channel an ideal channel. In
real situations however interaction with the environment, i.e. additional, unobservable
degrees of freedom, can not be avoided. The general structure of such a noisy
channel is given by

$$T^*(\rho) = \mathrm{tr}_\mathcal{K}\bigl(U(\rho \otimes \rho_0)U^*\bigr), \tag{3.33}$$

where $U: \mathcal{H} \otimes \mathcal{K} \to \mathcal{H} \otimes \mathcal{K}$ is a unitary operator describing the common evolution of
the system (Hilbert space $\mathcal{H}$) and the environment (Hilbert space $\mathcal{K}$) and $\rho_0 \in \mathcal{S}(\mathcal{K})$
is the initial state of the environment (cf. Figure 3.2). It is obvious that the quantum
information originally stored in $\rho \in \mathcal{S}(\mathcal{H})$ can not be completely recovered from
$T^*(\rho)$ if only one system is available. It is an easy consequence of the Stinespring
theorem that each channel can be expressed in this form.

Corollary 3.2.1 (Ancilla form) Assume that $T: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ is a channel.
Then there is a Hilbert space $\mathcal{K}$, a pure state $\rho_0$ and a unitary map $U: \mathcal{H} \otimes \mathcal{K} \to
\mathcal{H} \otimes \mathcal{K}$ such that Equation (3.33) holds. It is always possible to choose $\mathcal{K}$ such that
$\dim(\mathcal{K}) = \dim(\mathcal{H})^3$ holds.

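Proof. Consider the Stinespring form $T(A) = V^*(A \otimes \mathbb{1})V$ with $V: \mathcal{H} \to \mathcal{H} \otimes \mathcal{K}$
of $T$ and choose a vector $\psi \in \mathcal{K}$ such that $U(\phi \otimes \psi) = V(\phi)$ can be extended to
a unitary map $U: \mathcal{H} \otimes \mathcal{K} \to \mathcal{H} \otimes \mathcal{K}$ (this is always possible since $T$ is unital and
$V$ therefore isometric). If $e_j \in \mathcal{H}$, $j = 1, \ldots, d_1$ and $f_k \in \mathcal{K}$, $k = 1, \ldots, d_2$ are
orthonormal bases with $f_1 = \psi$ we get

$$\mathrm{tr}[T(A)\rho] = \mathrm{tr}[\rho V^*(A \otimes \mathbb{1})V] = \sum_j \langle V\rho e_j, (A \otimes \mathbb{1})Ve_j\rangle \tag{3.34}$$

$$= \sum_{jk} \bigl\langle U(\rho \otimes |\psi\rangle\langle\psi|)(e_j \otimes f_k), (A \otimes \mathbb{1})U(e_j \otimes f_k)\bigr\rangle \tag{3.35}$$

$$= \mathrm{tr}\bigl[\mathrm{tr}_\mathcal{K}[U(\rho \otimes |\psi\rangle\langle\psi|)U^*]A\bigr], \tag{3.36}$$

which proves the statement. □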



Note that there are in general many ways to express a channel this way, e.g. if
$T$ is an ideal channel $\rho \mapsto T^*\rho = U\rho U^*$ we can rewrite it with an arbitrary unitary
$U_0: \mathcal{K} \to \mathcal{K}$ by $T^*\rho = \mathrm{tr}_2\bigl((U \otimes U_0)(\rho \otimes \rho_0)(U^* \otimes U_0^*)\bigr)$. This is the weakness of the ancilla
form compared to the Stinespring representation of Theorem 2.3.2. Nevertheless
Corollary 3.2.1 shows that each channel which is not an ideal channel is noisy in
the described way.

Figure 3.2: Noisy channel

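The most prominent example for a noisy channel is the depolarizing channel for
$d$-level systems (i.e. $\mathcal{H} = \mathbb{C}^d$)

$$\mathcal{S}(\mathcal{H}) \ni \rho \mapsto \vartheta\rho + (1 - \vartheta)\frac{\mathbb{1}}{d} \in \mathcal{S}(\mathcal{H}), \qquad 0 \leq \vartheta \leq 1, \tag{3.37}$$

or in the Heisenberg picture

$$\mathcal{B}(\mathcal{H}) \ni A \mapsto \vartheta A + (1 - \vartheta)\frac{\mathrm{tr}(A)}{d}\mathbb{1} \in \mathcal{B}(\mathcal{H}). \tag{3.38}$$

A Stinespring dilation of $T$ (not the minimal one; this can be checked by counting
dimensions) is given by $\mathcal{K} = \mathcal{H} \otimes \mathcal{H} \oplus \mathbb{C}$ and $V: \mathcal{H} \to \mathcal{H} \otimes \mathcal{K} = \mathcal{H}^{\otimes 3} \oplus \mathcal{H}$ with

$$|j\rangle \mapsto V|j\rangle = \Bigl[\sqrt{\frac{1 - \vartheta}{d}} \sum_{k=1}^{d} |k\rangle \otimes |k\rangle \otimes |j\rangle\Bigr] \oplus \Bigl[\sqrt{\vartheta}\,|j\rangle\Bigr], \tag{3.39}$$

where $|k\rangle$, $k = 1, \ldots, d$ denotes again the canonical basis in $\mathcal{H}$. An ancilla form of
$T$ with the same $\mathcal{K}$ is given by the (pure) environment state

$$\psi = \Bigl[\sqrt{\frac{1 - \vartheta}{d}} \sum_{k=1}^{d} |k\rangle \otimes |k\rangle\Bigr] \oplus \Bigl[\sqrt{\vartheta}\Bigr] \in \mathcal{H} \otimes \mathcal{H} \oplus \mathbb{C} \tag{3.40}$$

and the unitary operator $U: \mathcal{H} \otimes \mathcal{K} \to \mathcal{H} \otimes \mathcal{K}$ with

$$U(\phi_1 \otimes \phi_2 \otimes \phi_3 \oplus \chi) = \phi_2 \otimes \phi_3 \otimes \phi_1 \oplus \chi, \tag{3.41}$$

i.e. $U$ is the direct sum of a permutation unitary and the identity.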
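
The dilation of the depolarizing channel can be verified with explicit matrices. The sketch below assumes NumPy and one particular layout of $\mathcal{H} \otimes \mathcal{K} = \mathcal{H}^{\otimes 3} \oplus \mathcal{H}$ (the first $d^3$ coordinates hold $|a\rangle|b\rangle|c\rangle$ at index $(ad + b)d + c$, the last $d$ coordinates hold the direct summand); this layout and all function names are assumptions made for illustration.

```python
import numpy as np


def stinespring_V(theta, d):
    """V|j> = sqrt((1-theta)/d) sum_k |k>|k>|j>  (+)  sqrt(theta)|j>."""
    V = np.zeros((d**3 + d, d))
    c = np.sqrt((1 - theta) / d)
    for j in range(d):
        for k in range(d):
            V[(k * d + k) * d + j, j] = c
        V[d**3 + j, j] = np.sqrt(theta)
    return V


def lift(A, d):
    """A (x) 1 on H^{(x)3} (+) H, with A acting on the first factor H."""
    out = np.zeros((d**3 + d, d**3 + d), dtype=complex)
    out[:d**3, :d**3] = np.kron(A, np.eye(d * d))
    out[d**3:, d**3:] = A
    return out


def permutation_unitary(d):
    """U(phi1 (x) phi2 (x) phi3 (+) chi) = phi2 (x) phi3 (x) phi1 (+) chi."""
    U = np.zeros((d**3 + d, d**3 + d))
    for a in range(d):
        for b in range(d):
            for c in range(d):
                U[(b * d + c) * d + a, (a * d + b) * d + c] = 1.0
    U[d**3:, d**3:] = np.eye(d)
    return U


def system_times_environment(j, theta, d):
    """|j> (x) psi for the environment vector psi of the ancilla form."""
    vec = np.zeros(d**3 + d)
    for k in range(d):
        vec[(j * d + k) * d + k] = np.sqrt((1 - theta) / d)
    vec[d**3 + j] = np.sqrt(theta)
    return vec


d, theta = 3, 0.3
rng = np.random.default_rng(0)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

V = stinespring_V(theta, d)
U = permutation_unitary(d)
T_of_A = V.T @ lift(A, d) @ V                      # V is real, so V* = V^T
expected = theta * A + (1 - theta) * np.trace(A) / d * np.eye(d)
ancilla_matches = all(
    np.allclose(U @ system_times_environment(j, theta, d), V[:, j])
    for j in range(d)
)
```

The comparison of `T_of_A` with `expected` reproduces the Heisenberg picture form of the channel, and `ancilla_matches` confirms that the unitary applied to $|j\rangle \otimes \psi$ gives back $V|j\rangle$, as required in the proof of Corollary 3.2.1.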

3.2.2. Channels under symmetry. — Similarly to the discussion in Section 3.1
it is often useful to consider channels with special symmetry properties. To be more
precise, consider a group $G$ and two unitary representations $\pi_1, \pi_2$ on the Hilbert
spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively. A channel $T: \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H}_2)$ is called covariant
(with respect to $\pi_1$ and $\pi_2$) if

$$T[\pi_1(U)A\pi_1(U)^*] = \pi_2(U)T[A]\pi_2(U)^* \qquad \forall A \in \mathcal{B}(\mathcal{H}_1)\ \ \forall U \in G \tag{3.42}$$

holds. The general structure of covariant channels is governed by a fairly powerful
variant of Stinespring's theorem which we will state below (and which will be very
useful for the study of the cloning problem in a later chapter). Before we do this let
us have a short look at a particular class of examples which is closely related to
OO-invariant states.

Hence consider a channel $T: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ which is covariant with respect
to the orthogonal group, i.e. $T(UAU^*) = UT(A)U^*$ for all unitaries $U$ on $\mathcal{H}$ with
$\bar{U} = U$ in a distinguished basis $|j\rangle$, $j = 1, \ldots, d$. The maximally entangled state $\psi =
d^{-1/2}\sum_j |jj\rangle$ is OO-invariant, i.e. $U \otimes U\psi = \psi$ for all these $U$. Therefore each state
$\rho = (\mathrm{Id} \otimes T^*)|\psi\rangle\langle\psi|$ is OO-invariant as well, and by the duality lemma (Theorem
2.3.4) $T$ and $\psi$ are uniquely determined (up to unitary equivalence) by $\rho$. This
means we can use the structure of OO-invariant states derived in Subsection 3.1.4
to characterize all orthogonal covariant channels. As a first step consider the linear
maps $X_1(A) = d\,\mathrm{tr}(A)\mathbb{1}$, $X_2(A) = dA^T$ and $X_3(A) = dA$. They are not channels
(they are not unital and $X_2$ is not cp) but they have the correct covariance property
and it is easy to see that they correspond to the operators $\mathbb{1}, F, \tilde{F} \in \mathcal{B}(\mathcal{H} \otimes \mathcal{H})$, i.e.

$$(\mathrm{Id} \otimes X_1)|\psi\rangle\langle\psi| = \mathbb{1}, \qquad (\mathrm{Id} \otimes X_2)|\psi\rangle\langle\psi| = F, \qquad (\mathrm{Id} \otimes X_3)|\psi\rangle\langle\psi| = \tilde{F}. \tag{3.43}$$

Using Equation (3.21) we can determine therefore the channels which belong to the
three extremal OO-invariant states (the corners of the upper triangle in Figure 3.1):

$$T_0(A) = A, \qquad T_1(A) = \frac{\mathrm{tr}(A)\mathbb{1} - A^T}{d - 1}, \tag{3.44}$$

$$T_2(A) = \frac{2}{d(d + 1) - 2}\Bigl[\frac{d}{2}\bigl(\mathrm{tr}(A)\mathbb{1} + A^T\bigr) - A\Bigr]. \tag{3.45}$$

Each OO-invariant channel is a convex linear combination of these three. Special
cases are the channels corresponding to Werner and isotropic states. The latter leads
to depolarizing channels $T(A) = \vartheta A + (1 - \vartheta)d^{-1}\mathrm{tr}(A)\mathbb{1}$ with $\vartheta \in [0, d^2/(d^2 - 1)]$;
cf. Equation (3.15), while Werner states correspond to

$$T(A) = \frac{\vartheta}{d + 1}\bigl[\mathrm{tr}(A)\mathbb{1} + A^T\bigr] + \frac{1 - \vartheta}{d - 1}\bigl[\mathrm{tr}(A)\mathbb{1} - A^T\bigr], \qquad \vartheta \in [0, 1]; \tag{3.46}$$

cf. Equation (3.8).

Let us come back now to the general case. We will state here the covariant
version of the Stinespring theorem (see [98] for a proof). The basic idea is that all
covariant channels are parameterized by representations on the dilation space.

Theorem 3.2.2 Let $G$ be a group with finite dimensional unitary representations
$\pi_j: G \to U(\mathcal{H}_j)$ and $T: \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H}_2)$ a $\pi_1, \pi_2$-covariant channel. Then
there is a finite dimensional unitary representation $\pi: G \to U(\mathcal{K})$ and an operator
$V: \mathcal{H}_2 \to \mathcal{H}_1 \otimes \mathcal{K}$ with $V\pi_2(U) = \bigl(\pi_1(U) \otimes \pi(U)\bigr)V$ and $T(A) = V^*(A \otimes \mathbb{1})V$.

To get an explicit example consider the dilation of a depolarizing channel given
in Equation (3.39). In this case we have $\pi_1(U) = \pi_2(U) = U$ and $\pi(U) = (\bar{U} \otimes U) \oplus \mathbb{1}$.
The check that the map $V$ has indeed the intertwining property $V\pi_2(U) = \bigl(\pi_1(U) \otimes
\pi(U)\bigr)V$ stated in the theorem is left as an exercise to the reader.
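
The exercise can also be checked numerically. The following sketch assumes NumPy and a specific ordering of tensor factors: the dilation is taken as $V|j\rangle = \sqrt{(1 - \vartheta)/d}\,\sum_k |k\rangle|k\rangle|j\rangle \oplus \sqrt{\vartheta}\,|j\rangle$, with the observable acting on the first factor. Under this assumed ordering the representation on $\mathcal{K} = \mathcal{H} \otimes \mathcal{H} \oplus \mathbb{C}$ that makes the intertwining relation hold is $(\bar{U} \otimes U) \oplus \mathbb{1}$; with a different ordering of the factors the conjugate would sit in the other slot. The function names are hypothetical.

```python
import numpy as np


def stinespring_V(theta, d):
    """V|j> = sqrt((1-theta)/d) sum_k |k>|k>|j>  (+)  sqrt(theta)|j>."""
    V = np.zeros((d**3 + d, d))
    c = np.sqrt((1 - theta) / d)
    for j in range(d):
        for k in range(d):
            V[(k * d + k) * d + j, j] = c
        V[d**3 + j, j] = np.sqrt(theta)
    return V


def pi1_tensor_pi(U, d):
    """U (x) pi(U) on H^{(x)3} (+) H with pi(U) = (conj(U) (x) U) (+) 1."""
    out = np.zeros((d**3 + d, d**3 + d), dtype=complex)
    out[:d**3, :d**3] = np.kron(U, np.kron(U.conj(), U))
    out[d**3:, d**3:] = U
    return out


d, theta = 3, 0.3

# A fixed unitary from the QR decomposition of a seeded random matrix.
rng = np.random.default_rng(2)
Z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
U, _ = np.linalg.qr(Z)

V = stinespring_V(theta, d)
left_side = V @ U                       # V pi_2(U)
right_side = pi1_tensor_pi(U, d) @ V    # (pi_1(U) (x) pi(U)) V
```

The key step behind the agreement of both sides is the identity $\sum_k U|k\rangle \otimes \bar{U}|k\rangle = \sum_k |k\rangle \otimes |k\rangle$, which holds for every unitary $U$.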

3.2.3. Classical channels. — The classical analog to a quantum operation is 
a channel $T : \mathcal{C}(X) \to \mathcal{C}(Y)$ which describes the transmission or manipulation of
classical information. As we have mentioned already in Subsection 2.3.1 positivity
and complete positivity are equivalent in this case. Hence we have to assume only
that $T$ is positive and unital. Obviously $T$ is characterized by its matrix elements
$T_{xy} = \delta_y(T|x\rangle\langle x|)$, where $\delta_y \in \mathcal{C}^*(Y)$ denotes the Dirac measure at $y \in Y$ and
$|x\rangle\langle x| \in \mathcal{C}(X)$ is the canonical basis in $\mathcal{C}(X)$ (cf. Subsection 2.1.3). Positivity and
normalization of $T$ imply that $0 \le T_{xy} \le 1$ and

$$1 = \delta_y\bigl(T(\mathbb{1})\bigr) = \delta_y\Bigl[T\Bigl(\sum_x |x\rangle\langle x|\Bigr)\Bigr] = \sum_x T_{xy} \tag{3.47}$$

holds. Hence the family $(T_{xy})_{x \in X}$ is a probability distribution on $X$ and $T_{xy}$ is
therefore the probability to get the information $x \in X$ at the output side of the
channel if $y \in Y$ was sent. Each classical channel is uniquely determined by its
matrix of transition probabilities. For $X = Y$ we see that the information is
transmitted without error iff $T_{xy} = \delta_{xy}$, i.e. $T$ is an ideal channel if $T = \mathrm{Id}$ holds and
noisy otherwise.
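The characterization by transition probabilities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the binary symmetric channel `BSC` are hypothetical and not taken from the text, and the matrix is stored as `T[x][y]`, the probability of output x given input y, so that each column is a probability distribution as in Equation (3.47).

```python
def is_classical_channel(T, tol=1e-12):
    """Check 0 <= T[x][y] <= 1 and sum_x T[x][y] = 1 for every input y."""
    n_out, n_in = len(T), len(T[0])
    for y in range(n_in):
        column = [T[x][y] for x in range(n_out)]
        if any(t < -tol or t > 1 + tol for t in column):
            return False
        if abs(sum(column) - 1.0) > tol:
            return False
    return True


def is_ideal(T, tol=1e-12):
    """For X = Y: the channel is ideal iff T[x][y] is the Kronecker delta."""
    n = len(T)
    if any(len(row) != n for row in T):
        return False
    return all(abs(T[x][y] - (1.0 if x == y else 0.0)) <= tol
               for x in range(n) for y in range(n))


def apply_channel(T, p):
    """Schroedinger picture: map an input distribution p on Y to one on X."""
    return [sum(T[x][y] * p[y] for y in range(len(p))) for x in range(len(T))]


# Hypothetical example: binary symmetric channel with flip probability 0.1.
BSC = [[0.9, 0.1],
       [0.1, 0.9]]
IDEAL = [[1.0, 0.0],
         [0.0, 1.0]]
```

Here `BSC` is a valid but noisy channel, while `IDEAL` is the error-free case $T = \mathrm{Id}$.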

3.2.4. Observables and Preparations. — Let us consider now a channel which 
transforms quantum information $\mathcal{B}(\mathcal{H})$ into classical information $\mathcal{C}(X)$. Since
positivity and complete positivity are again equivalent, we just have to look at a positive
and unital map $E : \mathcal{C}(X) \to \mathcal{B}(\mathcal{H})$. With the canonical basis $|x\rangle\langle x|$, $x \in X$ of $\mathcal{C}(X)$
we get a family $E_x = E(|x\rangle\langle x|)$, $x \in X$ of positive operators $E_x \in \mathcal{B}(\mathcal{H})$ with
$\sum_{x \in X} E_x = \mathbb{1}$. Hence the $E_x$ form a POV measure, i.e. an observable. If on the
other hand a POV measure $E_x \in \mathcal{B}(\mathcal{H})$, $x \in X$ is given we can define a quantum
to classical channel $E : \mathcal{C}(X) \to \mathcal{B}(\mathcal{H})$ by $E(f) = \sum_x f(x)E_x$. This shows that
the observable $E_x$, $x \in X$ and the channel $E$ can be identified and we say $E$ is the
observable.

With this interpretation in mind it is possible to have a short look at continuous
observables without the need of abstract measure theory: We only have to say how
the classical algebra $\mathcal{C}(X)$ is defined for a set $X$ which is not finite or discrete. For
simplicity we assume that $X = \mathbb{R}$ holds, however the generalization to other locally
compact spaces is straightforward. We choose for $\mathcal{C}(\mathbb{R})$ the space of continuous,
complex valued functions vanishing at infinity, i.e. $|f(x)| < \epsilon$ for each $\epsilon > 0$ provided
$|x|$ is large enough. $\mathcal{C}(\mathbb{R})$ can be equipped with the sup-norm and becomes an
Abelian C*-algebra (cf. Pq|). To interpret it as an operator algebra as assumed in
Subsection 2.1.1 we have to identify $f \in \mathcal{C}(\mathbb{R})$ with the corresponding multiplication
operator on $L^2(\mathbb{R})$. An observable taking arbitrary real values can be defined now
as a positive map $E : \mathcal{C}(\mathbb{R}) \to \mathcal{B}(\mathcal{H})$. The probability to get a result in the interval
$[a, b] \subset \mathbb{R}$ during an $E$ measurement on systems in the state $\rho$ is

$$\mu([a,b]) = \sup\bigl\{\operatorname{tr}(E(f)\rho) \,\big|\, f \in \mathcal{C}(\mathbb{R}),\ 0 \le f \le \mathbb{1},\ \operatorname{supp} f \subset [a,b]\bigr\} \tag{3.48}$$

where supp denotes the support of $f$. The most well known examples for $\mathbb{R}$ valued
observables are of course position $Q$ and momentum $P$ of a free particle in one
dimension. In this case we have $\mathcal{H} = L^2(\mathbb{R})$ and the channels corresponding to $Q$
and $P$ are (in position representation) given by $\mathcal{C}(\mathbb{R}) \ni f \mapsto E_Q(f) \in \mathcal{B}(\mathcal{H})$ with
$E_Q(f)\psi = f\psi$ respectively $\mathcal{C}(\mathbb{R}) \ni f \mapsto E_P(f) \in \mathcal{B}(\mathcal{H})$ with $E_P(f)\psi = (f\hat{\psi})^{\vee}$
where $\wedge$ and $\vee$ denote the Fourier transform and its inverse.

Let us return now to a finite set $X$ and exchange the role of $\mathcal{C}(X)$ and $\mathcal{B}(\mathcal{H})$; in
other words let us consider a channel $R : \mathcal{B}(\mathcal{H}) \to \mathcal{C}(X)$ with a classical input and
a quantum output algebra. In the Schrödinger picture we get a family of density
matrices $\rho_x := R^*(\delta_x) \in \mathcal{B}^*(\mathcal{H})$, $x \in X$, where $\delta_x \in \mathcal{C}^*(X)$ denote again the Dirac
measures (cf. Subsection 2.1.3). Hence we get a parameter dependent preparation
which can be used to encode the classical information $x \in X$ into the quantum
information $\rho_x \in \mathcal{B}^*(\mathcal{H})$.

3.2.5. Instruments and Parameter Dependent Operations. — An observ- 
able describes only the statistics of measuring results, but contains no information 
about the state of the system after the measurement. To get a description which fills
this gap we have to consider channels which operate on quantum systems and
produce hybrid systems as output, i.e. $T : \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{K})$. Following Davies
[^o| we will call such an object an instrument. From $T$ we can derive the subchannel

$$\mathcal{C}(X) \ni f \mapsto T(\mathbb{1} \otimes f) \in \mathcal{B}(\mathcal{K}) \tag{3.49}$$

which is the observable measured by $T$, i.e. $\operatorname{tr}[T(\mathbb{1} \otimes |x\rangle\langle x|)\rho]$ is the probability to
measure $x \in X$ on systems in the state $\rho$. On the other hand we get for each $x \in X$
a quantum channel (which is not unital)

$$\mathcal{B}(\mathcal{H}) \ni A \mapsto T_x(A) = T(A \otimes |x\rangle\langle x|) \in \mathcal{B}(\mathcal{K}). \tag{3.50}$$

It describes the operation performed by the instrument $T$ if $x \in X$ was measured.
More precisely if a measurement on systems in the state $\rho$ gives the result $x \in X$ we
get (up to normalization) the state $T_x^*(\rho)$ after the measurement (cf. Figure 3.3),
while

$$\operatorname{tr}\bigl(T_x^*(\rho)\bigr) = \operatorname{tr}\bigl(T_x^*(\rho)\mathbb{1}\bigr) = \operatorname{tr}\bigl(\rho\,T(\mathbb{1} \otimes |x\rangle\langle x|)\bigr) \tag{3.51}$$

is (again) the probability to measure $x \in X$ on $\rho$. The instrument $T$ can be expressed
in terms of the operations $T_x$ by

$$T(A \otimes f) = \sum_x f(x)\,T_x(A); \tag{3.52}$$

hence we can identify $T$ with the family $T_x$, $x \in X$. Finally we can consider the
second marginal of $T$

$$\mathcal{B}(\mathcal{H}) \ni A \mapsto T(A \otimes \mathbb{1}) = \sum_{x \in X} T_x(A) \in \mathcal{B}(\mathcal{K}). \tag{3.53}$$

It describes the operation we get if the outcome of the measurement is ignored.

Footnote to Equation (3.48): Due to the Riesz-Markov theorem (cf. Theorem IV.18 of [134]) the set function $\mu$ extends in
a unique way to a probability measure on the real line.

The most well known example of an instrument is a von Neumann-Lüders
measurement associated to a PV measure given by a family of projections $E_x$, $x = 1, \ldots, d$;
e.g. the eigenprojections of a selfadjoint operator $A \in \mathcal{B}(\mathcal{H})$. It is defined as the
channel

$$T : \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{H}) \ \text{with}\ X = \{1, \ldots, d\}\ \text{and}\ T_x(A) = E_x A E_x. \tag{3.54}$$

Hence we get the final state $\operatorname{tr}(E_x\rho)^{-1}E_x\rho E_x$ if we measure the value $x \in X$ on
systems initially in the state $\rho$ - this is well known from quantum mechanics.
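As a small illustration of Equation (3.54), the following sketch computes, for each outcome x, the probability $\operatorname{tr}(E_x\rho)$ and the normalized post-measurement state $E_x\rho E_x/\operatorname{tr}(E_x\rho)$. It is a minimal pure-Python example under stated assumptions: matrices are nested lists, and the helper names and the qubit example (the state $|+\rangle\langle+|$ measured in the computational basis) are hypothetical choices, not taken from the text.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def lueders_instrument(rho, projections):
    """Return (probability, post-measurement state) for each projection E_x.

    The unnormalized output is E_x rho E_x; its trace equals tr(E_x rho)
    because E_x is a projection.
    """
    results = []
    for E in projections:
        unnormalized = matmul(matmul(E, rho), E)
        p = trace(unnormalized).real
        post = [[v / p for v in row] for row in unnormalized] if p > 1e-15 else None
        results.append((p, post))
    return results


# Hypothetical example: qubit in the state |+><+|, measured in the standard basis.
RHO_PLUS = [[0.5, 0.5],
            [0.5, 0.5]]
E0 = [[1, 0], [0, 0]]
E1 = [[0, 0], [0, 1]]
OUTCOMES = lueders_instrument(RHO_PLUS, [E0, E1])
```

Summing the unnormalized outputs over x gives the second marginal of Equation (3.53), i.e. the state obtained when the measuring result is ignored.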

Let us change now the role of $\mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X)$ and $\mathcal{B}(\mathcal{K})$; in other words consider
a channel $T : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X)$ with hybrid input and quantum output. It
describes a device which changes the state of a system depending on additional
classical information. As for an instrument, $T$ decomposes into a family of
(unital!) channels $T_x : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{H})$ such that we get $T^*(\rho \otimes p) = \sum_x p_x T_x^*(\rho)$ in
the Schrödinger picture. Physically $T$ describes a parameter dependent operation:
depending on the classical information $x \in X$ the quantum information $\rho \in \mathcal{B}^*(\mathcal{H})$
is transformed by the operation $T_x$ (cf. Figure 3.4).

Finally we can consider a channel $T : \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{K}) \otimes \mathcal{C}(Y)$ with hybrid
input and output to get a parameter dependent instrument (cf. Figure 3.5): Similarly
to the discussion in the last paragraph we can define a family of instruments
$T_y : \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{K})$, $y \in Y$ by the equation $T^*(\rho \otimes p) = \sum_y p_y T_y^*(\rho)$. Physically
$T$ describes the following device: It receives the classical information $y \in Y$ and a
quantum system in the state $\rho \in \mathcal{B}^*(\mathcal{K})$ as input. Depending on $y$ a measurement
with the instrument $T_y$ is performed, which in turn produces the measuring value
$x \in X$ and leaves the quantum system in the state (up to normalization) $T_{y,x}^*(\rho)$;
with $T_{y,x}$ given as in Equation (3.50) by $T_{y,x}(A) = T_y(A \otimes |x\rangle\langle x|)$.

Figure 3.3: Instrument

Figure 3.4: Parameter dependent operation

Figure 3.5: Parameter dependent instrument



3.2.6. LOCC and separable channels. — Let us consider now channels acting 
on finite dimensional bipartite systems: $T : \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2) \to \mathcal{B}(\mathcal{K}_1 \otimes \mathcal{K}_2)$. In this case
we can ask the question whether a channel preserves separability. Simple examples
are local operations (LO), i.e. $T = T^A \otimes T^B$ with two channels $T^{A,B} : \mathcal{B}(\mathcal{H}_j) \to \mathcal{B}(\mathcal{K}_j)$.
Physically we think of such a $T$ in terms of two physicists Alice and Bob both
performing operations on their own particle but without any information transmission,
neither classical nor quantum. The next more difficult step are local operations with
one way classical communications (one way LOCC). This means Alice operates
on her system with an instrument, communicates the classical measuring result
$j \in X = \{1, \ldots, N\}$ to Bob and he selects an operation depending on these data.
We can write such a channel as a composition $T = (T^A \otimes \mathrm{Id})(\mathrm{Id} \otimes T^B)$ of the
instrument $T^A : \mathcal{B}(\mathcal{H}_1) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{K}_1)$ and the parameter dependent operation
$T^B : \mathcal{B}(\mathcal{H}_2) \to \mathcal{C}(X) \otimes \mathcal{B}(\mathcal{K}_2)$ (cf. Figure 3.6)

$$\mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2) \xrightarrow{\ \mathrm{Id} \otimes T^B\ } \mathcal{B}(\mathcal{H}_1) \otimes \mathcal{C}(X) \otimes \mathcal{B}(\mathcal{K}_2) \xrightarrow{\ T^A \otimes \mathrm{Id}\ } \mathcal{B}(\mathcal{K}_1 \otimes \mathcal{K}_2). \tag{3.55}$$

It is of course possible to continue the chain in Equation (3.55), i.e. instead of
just operating on his system, Bob can invoke a parameter dependent instrument
depending on Alice's data $j_1 \in X_1$, send the corresponding measuring results $j_2 \in X_2$
to Alice and so on. To write down the corresponding chain of maps (as in Equation
(3.55)) is simple but not very illuminating and therefore omitted; cf. Figure 3.7
instead. If we allow Alice and Bob to drop some of their particles, i.e. the operations
they perform need not be unital, we get a LOCC channel ("local operations and
classical communications"). It represents the most general physical process which
can be performed on a bipartite system if only classical communication (in both
directions) is available.

Figure 3.6: One way LOCC operation; cf. Figure 3.7 for an explanation

Figure 3.7: LOCC operation. The upper and lower curly arrows represent Alice's
respectively Bob's quantum system, while the straight arrows in the middle stand
for the classical information Alice and Bob exchange. The boxes symbolize the
channels applied by Alice and Bob.



LOCC channels play a significant role in entanglement theory (we will see this
in Section 4.3), but they are difficult to handle. Fortunately it is often possible
to replace them by closely related operations with a simpler structure: A not
necessarily unital channel $T : \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2) \to \mathcal{B}(\mathcal{K}_1 \otimes \mathcal{K}_2)$ is called separable, if it
is a sum of (in general non-unital) local operations, i.e.

$$T = \sum_{j=1}^{N} T_j^A \otimes T_j^B. \tag{3.56}$$

It is easy to see that a separable $T$ maps separable states to separable states (up
to normalization) and that each LOCC channel is separable (cf. |Q). The converse
however is (somewhat surprisingly) not true: there are separable channels which are
not LOCC, see for a concrete example.

3.3 Quantum mechanics in phase space 

Up to now we have considered only finite dimensional systems and even in this
extremely idealized situation it is not easy to get nontrivial results. At a first look
the discussion of continuous quantum systems seems therefore to be hopeless. If we
restrict our attention however to small classes of states and channels, with
sufficiently simple structure, many problems become tractable. Phase space quantum
mechanics, which will be reviewed in this Section (see Chapter 5 of [79] for details),
provides a very powerful tool in this context.

Before we start let us add some remarks to the discussion of Chapter ^ which we 
have restricted to finite dimensional Hilbert spaces. Basically most of the material 
considered there can be generalized in a straightforward way, as long as topological 
issues like continuity and convergence arguments are treated carefully enough. There 
are of course some caveats (cf. in particular Footnote || of Chapter ||), however they 
do not lead to problems in the framework we are going to discuss and can therefore 
be ignored. 

3.3.1. Weyl operators and the CCR. — The kinematical structure of a quan- 
tum system with d degrees of freedom is usually described by a separable Hilbert 
space $\mathcal{H}$ and $2d$ selfadjoint operators $Q_1, \ldots, Q_d, P_1, \ldots, P_d$ satisfying the
canonical commutation relations $[Q_j, Q_k] = 0$, $[P_j, P_k] = 0$, $[Q_j, P_k] = i\delta_{jk}\mathbb{1}$. The latter
can be rewritten in a more compact form as

$$R_{2j-1} = Q_j,\quad R_{2j} = P_j,\quad j = 1, \ldots, d, \qquad [R_j, R_k] = i\sigma_{jk}\mathbb{1}. \tag{3.57}$$

Here $\sigma$ denotes the symplectic matrix

$$\sigma = \operatorname{diag}(J, \ldots, J), \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{3.58}$$

which plays a crucial role for the geometry of classical mechanics. We will call
the pair $(V, \sigma)$ consisting of $\sigma$ and the $2d$-dimensional real vector space $V = \mathbb{R}^{2d}$
henceforth the classical phase space.

The relations in Equation (3.57) are, however, not sufficient to fix the
operators $R_j$ up to unitary equivalence. The best way to remove the remaining physical
ambiguities is the study of the unitaries

$$W(x) = \exp(i\,x \cdot \sigma \cdot R), \quad x \in V, \qquad x \cdot \sigma \cdot R = \sum_{j,k=1}^{2d} x_j\sigma_{jk}R_k \tag{3.59}$$

instead of the $R_j$ directly. If the family $W(x)$, $x \in V$ is irreducible (i.e. $[W(x), A] = 0$,
$\forall x \in V$ implies $A = \lambda\mathbb{1}$ with $\lambda \in \mathbb{C}$) and satisfies

$$W(x)W(x') = \exp\Bigl(-\frac{i}{2}\,x \cdot \sigma \cdot x'\Bigr)W(x + x'), \tag{3.60}$$

it is called an (irreducible) representation of the Weyl relations (on $(V, \sigma)$) and the
operators $W(x)$ are called Weyl operators. By the well known Stone - von Neumann
uniqueness theorem all these representations are mutually unitarily equivalent, i.e. if
we have two of them $W_1(x)$, $W_2(x)$, there is a unitary operator $U$ with $UW_1(x)U^* = W_2(x)$
$\forall x \in V$. This implies that it does not matter from a physical point of view
which representation we use. The most well known one is of course the Schrödinger
representation where $\mathcal{H} = L^2(\mathbb{R}^d)$ and $Q_j$, $P_j$ are the usual position and momentum
operators.



Footnote: Note that the CCR (3.57) are implied by the Weyl relations (3.60) but the converse is, in
contrast to popular belief, not true: There are representations of the CCR which are unitarily
inequivalent to the Schrödinger representation; cf. [134] Section VIII.5 for particular examples.
Hence uniqueness can only be achieved on the level of Weyl operators - which is one major reason
to study them.



3.3.2. Gaussian states. — A density operator $\rho \in \mathcal{S}(\mathcal{H})$ has finite second
moments if the expectation values $\operatorname{tr}(\rho Q_j^2)$ and $\operatorname{tr}(\rho P_j^2)$ are finite for all $j = 1, \ldots, d$.
In this case we can define the mean $m \in \mathbb{R}^{2d}$ and the correlation matrix $\alpha$ by

$$m_j = \operatorname{tr}(\rho R_j), \qquad \alpha_{jk} + i\sigma_{jk} = 2\operatorname{tr}\bigl[(R_j - m_j)\rho(R_k - m_k)\bigr]. \tag{3.61}$$

The mean $m$ can be arbitrary, but the correlation matrix $\alpha$ must be real and
symmetric and the positivity condition

$$\alpha + i\sigma \ge 0 \tag{3.62}$$

must hold (this is an easy consequence of the canonical commutation relations
(3.57)).

Our aim is now to distinguish exactly one state among all others with the same
mean and correlation matrix. This is the point where the Weyl operators come into
play. Each state $\rho \in \mathcal{S}(\mathcal{H})$ can be characterized uniquely by its quantum
characteristic function $V \ni x \mapsto \operatorname{tr}[W(x)\rho] \in \mathbb{C}$ which should be regarded as the quantum
Fourier transform of $\rho$ and is in fact the Fourier transform of the Wigner function
of $\rho$ [165]. We call $\rho$ Gaussian if

$$\operatorname{tr}[W(x)\rho] = \exp\Bigl(i\,m \cdot x - \frac{1}{4}\,x \cdot \alpha \cdot x\Bigr) \tag{3.63}$$

holds. By differentiation it is easy to check that $\rho$ has indeed mean $m$ and covariance
matrix $\alpha$.

The most prominent examples for Gaussian states are the ground state $\rho_0$ of a
system of $d$ harmonic oscillators (where the mean is $0$ and $\alpha$ is given by the
corresponding classical Hamiltonian) and its phase space translates $\rho_m = W(m)\rho_0 W(-m)$
(with mean $m$ and the same $\alpha$ as $\rho_0$), which are known from quantum optics as
coherent states. $\rho_0$ and $\rho_m$ are pure states and it can be shown that a Gaussian
state is pure iff $(\sigma^{-1}\alpha)^2 = -\mathbb{1}$ holds (see Ch. 5). Examples for mixed Gaussians
are temperature states of harmonic oscillators. In one degree of freedom this is

$$\rho_N = \frac{1}{N+1}\sum_{n=0}^{\infty}\Bigl(\frac{N}{N+1}\Bigr)^{n}|n\rangle\langle n| \tag{3.64}$$

where $|n\rangle\langle n|$ denotes the number basis and $N$ is the mean photon number. The
characteristic function of $\rho_N$ is

$$\operatorname{tr}[W(x)\rho_N] = \exp\Bigl[-\frac{1}{2}\Bigl(N + \frac{1}{2}\Bigr)|x|^2\Bigr] \tag{3.65}$$

and its correlation matrix is simply $\alpha = 2(N + 1/2)\mathbb{1}$.
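For one degree of freedom the conditions above reduce to simple scalar checks, which the following sketch illustrates. It relies on two elementary facts for a real symmetric 2x2 matrix $\alpha$ and the 2x2 symplectic matrix of Equation (3.58): $\alpha + i\sigma \ge 0$ holds iff $\operatorname{tr}\alpha \ge 0$ and $\det\alpha \ge 1$, and $(\sigma^{-1}\alpha)^2 = -\det(\alpha)\,\mathbb{1}$, so that purity amounts to $\det\alpha = 1$. The function names and the truncation parameter `cutoff` are assumptions made only for this illustration.

```python
def is_valid_one_mode(alpha, tol=1e-12):
    """Check alpha + i*sigma >= 0 for a real symmetric 2x2 matrix alpha.

    For the Hermitian 2x2 matrix alpha + i*sigma the trace is tr(alpha)
    and the determinant is det(alpha) - 1.
    """
    (a, b), (c, d) = alpha
    if abs(b - c) > tol:
        raise ValueError("alpha must be symmetric")
    return (a + d) >= -tol and (a * d - b * b - 1.0) >= -tol


def is_pure_one_mode(alpha, tol=1e-12):
    """Purity (sigma^-1 alpha)^2 = -1 is equivalent to det(alpha) = 1 here."""
    (a, b), (_, d) = alpha
    return is_valid_one_mode(alpha, tol) and abs(a * d - b * b - 1.0) <= tol


def thermal_alpha(N):
    """Correlation matrix 2(N + 1/2) * identity of the thermal state rho_N."""
    s = 2.0 * (N + 0.5)
    return [[s, 0.0], [0.0, s]]


def thermal_mean_photon_number(N, cutoff=2000):
    """Mean of n under the weights (1/(N+1)) (N/(N+1))^n of Equation (3.64),
    truncated at `cutoff` terms."""
    q = N / (N + 1.0)
    return sum(n * q ** n for n in range(cutoff)) / (N + 1.0)
```

With these definitions `thermal_alpha(0.0)` (the ground state) is valid and pure, `thermal_alpha(1.5)` is valid but mixed, and a matrix such as `[[0.5, 0.0], [0.0, 0.5]]` violates Equation (3.62).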

3.3.3. Entangled Gaussians. — Let us consider now bipartite systems. Hence 
the phase space $(V, \sigma)$ decomposes into a direct sum $V = V_A \oplus V_B$ (where $A$ stands
for "Alice" and $B$ for "Bob") and the symplectic matrix $\sigma = \sigma_A \oplus \sigma_B$ is block
diagonal with respect to this decomposition. If $W_A(x)$ respectively $W_B(y)$ denote
Weyl operators, acting on the Hilbert spaces $\mathcal{H}_A$, $\mathcal{H}_B$, and corresponding to the
phase spaces $V_A$ and $V_B$, it is easy to see that the tensor product $W_A(x) \otimes W_B(y)$
satisfies the Weyl relations with respect to $(V, \sigma)$. Hence by the Stone - von Neumann
uniqueness theorem we can identify $W(x \oplus y)$, $x \oplus y \in V_A \oplus V_B = V$ with
$W_A(x) \otimes W_B(y)$. This shows immediately that a state $\rho$ on $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ is a product state
iff its characteristic function factorizes. Separability is characterized as follows (we
omit the proof, see [170] instead).

Theorem 3.3.1 A Gaussian state with covariance matrix $\alpha$ is separable iff there
are covariance matrices $\alpha_A$, $\alpha_B$ such that

$$\alpha \ge \begin{pmatrix} \alpha_A & 0 \\ 0 & \alpha_B \end{pmatrix} \tag{3.66}$$

holds.

Footnote: In infinite dimensions we have to define separable states (in slight generalization to Definition
2.2.5) as a trace-norm convergent convex sum of product states.



This theorem is somewhat similar to Theorem 2.4.1: It provides a useful criterion
as long as abstract considerations are concerned, but not for explicit calculations.
In contrast to finite dimensional systems, however, separability of Gaussian states
can be decided by an operational criterion in terms of nonlinear maps between
matrices [65]. To state it we have to introduce some terminology first. The key tool
is a sequence of $(2n + 2m) \times (2n + 2m)$ matrices $\alpha_N$, $N \in \mathbb{N}$, written in block matrix
notation as

$$\alpha_N = \begin{pmatrix} A_N & C_N \\ C_N^T & B_N \end{pmatrix}. \tag{3.67}$$

Given $\alpha_0$ the other $\alpha_N$ are recursively defined by:

$$A_{N+1} = B_{N+1} = A_N - \operatorname{Re}(X_N) \quad\text{and}\quad C_{N+1} = -\operatorname{Im}(X_N) \tag{3.68}$$

if $\alpha_N - i\sigma \ge 0$ and $\alpha_{N+1} = 0$ otherwise. Here we have set $X_N = C_N(B_N - i\sigma_B)^{-1}C_N^T$
and the inverse denotes the pseudo inverse if $B_N - i\sigma_B$ is not invertible. Now we
can state the following theorem (see [65] for a proof):

Theorem 3.3.2 Consider a Gaussian state $\rho$ of a bipartite system with correlation
matrix $\alpha_0$ and the sequence $\alpha_N$, $N \in \mathbb{N}$ just defined.

1. If for some $N \in \mathbb{N}$ we have $A_N - i\sigma_A \not\ge 0$ then $\rho$ is not separable.

2. If there is on the other hand an $N \in \mathbb{N}$ such that $A_N - \|C_N\|\mathbb{1} - i\sigma_A \ge 0$,
then the state $\rho$ is separable ($\|C_N\|$ denotes the operator norm of $C_N$).

To check whether a Gaussian state $\rho$ is separable or not we have to iterate
through the sequence $\alpha_N$ until either condition 1 or 2 holds. In the first case we
know that $\rho$ is entangled and separable in the second. Hence only the question
remains whether the whole procedure terminates after a finite number of iterations. 
This problem is treated in and it turns out that the set of p for which separability 
is decidable after a finite number of steps is the complement of a measure zero set 
(in the set of all separable states). Numerical calculations indicate in addition that 
the method converges usually very fast (typically less than five iterations). 

To consider ppt states we first have to characterize the transpose for infinite
dimensional systems. There are different ways to do that. We will use the fact that
the adjoint of a matrix can be regarded as transposition followed by componentwise
complex conjugation. Hence we define for any (possibly unbounded) operator $A^T = CA^*C$,
where $C : \mathcal{H} \to \mathcal{H}$ denotes complex conjugation of the wave function in
position representation. This implies $Q_j^T = Q_j$ for position and $P_j^T = -P_j$ for
momentum operators. If we insert the partial transpose of a bipartite state $\rho$ into
Equation (3.61) we see that the correlation matrix $\tilde{\alpha}_{jk}$ of $\rho^T$ picks up a minus sign
whenever one of the indices belongs to one of Alice's momentum operators. To be
a state $\tilde{\alpha}$ should satisfy $\tilde{\alpha} + i\sigma \ge 0$, but this is equivalent to $\alpha + i\tilde{\sigma} \ge 0$, where in
$\tilde{\sigma}$ the corresponding components are reversed, i.e. $\tilde{\sigma} = (-\sigma_A) \oplus \sigma_B$. Hence we have
shown

Proposition 3.3.3 A Gaussian state is ppt iff its correlation matrix $\alpha$ satisfies

$$\alpha + i\tilde{\sigma} \ge 0 \quad\text{with}\quad \tilde{\sigma} = \begin{pmatrix} -\sigma_A & 0 \\ 0 & \sigma_B \end{pmatrix}. \tag{3.69}$$

Footnote: $A^{-1}$ is the pseudo inverse of a matrix $A$ if $AA^{-1} = A^{-1}A$ is the projector onto the range of
$A$. If $A$ is invertible $A^{-1}$ is the usual inverse.



The interesting question is now whether the ppt criterion is for a given number
of degrees of freedom equivalent to separability or not. The following theorem which
was proved in [144] for $1 \times 1$ systems and in [170] in the $1 \times d$ case gives a complete
answer.

Theorem 3.3.4 A Gaussian state of a quantum system with $1 \times d$ degrees of freedom
(i.e. $\dim V_A = 2$ and $\dim V_B = 2d$) is separable iff it is ppt; in other words iff the
condition of Proposition 3.3.3 holds.
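For $1 \times 1$ degrees of freedom the criterion of Proposition 3.3.3 can be checked numerically, as in the following sketch. Several choices here are assumptions made for illustration and are not taken from the text: the phase space coordinates are ordered as $(q_A, p_A, q_B, p_B)$, the example state is a two-mode squeezed state with squeezing parameter `r` in the normalization where the vacuum has $\alpha = \mathbb{1}$, and positivity is tested by a Cholesky factorization of the matrix shifted by a small tolerance rather than by an eigenvalue routine.

```python
import math

J = [[0.0, 1.0], [-1.0, 0.0]]


def direct_sum(a, b):
    """Block diagonal matrix with blocks a and b."""
    n, m = len(a), len(b)
    out = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            out[i][j] = a[i][j]
    for i in range(m):
        for j in range(m):
            out[n + i][n + j] = b[i][j]
    return out


def is_psd(M, tol=1e-9):
    """Test M >= 0 for a Hermitian matrix M (up to tol): the shifted matrix
    M + tol*1 must admit a Cholesky factorization with positive pivots."""
    n = len(M)
    L = [[0j] * n for _ in range(n)]
    for j in range(n):
        d = complex(M[j][j]).real + tol - sum(abs(L[j][k]) ** 2 for k in range(j))
        if d <= 0.0:
            return False
        L[j][j] = complex(math.sqrt(d))
        for i in range(j + 1, n):
            s = M[i][j] - sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            L[i][j] = s / L[j][j]
    return True


def plus_i(alpha, sigma):
    """Return the complex matrix alpha + i*sigma."""
    n = len(alpha)
    return [[alpha[i][j] + 1j * sigma[i][j] for j in range(n)] for i in range(n)]


SIGMA = direct_sum(J, J)                                      # sigma_A + sigma_B
SIGMA_PT = direct_sum([[-v for v in row] for row in J], J)    # (-sigma_A) + sigma_B


def is_state(alpha):
    """Positivity condition (3.62)."""
    return is_psd(plus_i(alpha, SIGMA))


def is_ppt(alpha):
    """Condition (3.69) of Proposition 3.3.3."""
    return is_psd(plus_i(alpha, SIGMA_PT))


def two_mode_squeezed(r):
    """Correlation matrix of a two-mode squeezed state (assumed normalization)."""
    c, s = math.cosh(2 * r), math.sinh(2 * r)
    return [[c, 0.0, s, 0.0],
            [0.0, c, 0.0, -s],
            [s, 0.0, c, 0.0],
            [0.0, -s, 0.0, c]]
```

For `r = 0` this gives the product vacuum, which passes both tests. For `r > 0` the matrix still satisfies Equation (3.62) but violates (3.69), so by Theorem 3.3.4 the state is entangled.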



For other kinds of systems the ppt criterion may fail which means that there
are entangled Gaussian states which are ppt. A systematic way to construct such
states can be found in [170]. Roughly speaking, it is based on the idea to go to the
boundary of the set of ppt covariance matrices, i.e. $\alpha$ has to satisfy Equation (3.62)
and (3.69) and it has to be a minimal matrix with this property. Using this method
explicit examples for ppt and entangled Gaussians are constructed for $2 \times 2$ degrees
of freedom (cf. [170] for details).



3.3.4. Gaussian channels. — Finally we want to give a short review on a special 



class of channels for infinite dimensional quantum systems (cf. [84] for details). To
explain the basic idea note first that each finite set of Weyl operators ($W(x_j)$,
$j = 1, \ldots, N$, $x_j \ne x_k$ for $j \ne k$) is linearly independent. This can be checked
easily using expectation values of $\sum_j \lambda_j W(x_j)$ in Gaussian states. Hence linear
maps on the space of finite linear combinations of Weyl operators can be defined
by $T[W(x)] = f(x)W(Ax)$ where $f$ is a complex valued function on $V$ and $A$ is a
$2d \times 2d$ matrix. If we choose $A$ and $f$ carefully enough, such that some continuity
properties match, $T$ can be extended in a unique way to a linear map on $\mathcal{B}(\mathcal{H})$ -
which is, however, in general not completely positive.

This means we have to consider special choices for $A$ and $f$. The easiest
case arises if $f \equiv 1$ and $A$ is a symplectic isomorphism, i.e. $A^T\sigma A = \sigma$. If this
holds the map $V \ni x \mapsto W(Ax)$ is a representation of the Weyl relations and
therefore unitarily equivalent to the representation we have started with. In other
words there is a unitary operator $U$ with $T[W(x)] = W(Ax) = UW(x)U^*$, i.e.
$T$ is unitarily implemented, hence completely positive and, in fact, well known as
Bogolubov transformation.

If $A$ does not preserve the symplectic matrix, $f \equiv 1$ is no option. Instead we
have to choose $f$ such that the matrices

are positive. Complete positivity of the corresponding $T$ is then a standard result
of abstract C*-algebra theory (cf. |Q). If the factor $f$ is in addition a Gaussian,
i.e. $f(x) = \exp(-\frac{1}{2}\,x \cdot \beta x)$ for a positive definite matrix $\beta$ the cp-map $T$ is called a
Gaussian channel.

A simple way to construct a Gaussian channel is in terms of an ancilla
representation. More precisely, if $A : V \to V$ is an arbitrary linear map we can extend
it to a symplectic map $V \ni x \mapsto Ax \oplus A'x \in V \oplus V'$, where the symplectic
vector space $(V', \sigma')$ now refers to the environment. Consider now the Weyl operator
$W(x) \otimes W'(x') = W(x, x')$ on the Hilbert space $\mathcal{H} \otimes \mathcal{H}'$ associated to the phase
space element $x \oplus x' \in V \oplus V'$. Since $A \oplus A'$ is symplectic it admits a unitary
Bogolubov transformation $U : \mathcal{H} \otimes \mathcal{H}' \to \mathcal{H} \otimes \mathcal{H}'$ with $U^*W(x, x')U = W(Ax, A'x)$.
If $\rho'$ denotes now a Gaussian density matrix on $\mathcal{H}'$ describing the initial state of
the environment we get a Gaussian channel by

$$\operatorname{tr}[T^*(\rho)W(x)] = \operatorname{tr}[\rho \otimes \rho'\,U^*W(x, x')U] = \operatorname{tr}[\rho W(Ax)]\operatorname{tr}[\rho'W(A'x)]. \tag{3.71}$$

Hence $T[W(x)] = f(x)W(Ax)$ with $f(x) = \operatorname{tr}[\rho'W(A'x)]$.

Particular examples for Gaussian channels in the case of one degree of freedom
are attenuation and amplification channels js^ . They are given in terms of a
real parameter $k \ne 1$ by $\mathbb{R}^2 \ni x \mapsto Ax = kx \in \mathbb{R}^2$,

$$\mathbb{R}^2 \ni x \mapsto A'x = \sqrt{1 - k^2}\,x \in \mathbb{R}^2 \tag{3.72}$$

for $k < 1$ and

$$\mathbb{R}^2 \ni (q, p) \mapsto A'(q, p) = (\kappa q, -\kappa p) \in \mathbb{R}^2 \quad\text{with}\quad \kappa = \sqrt{k^2 - 1} \tag{3.73}$$

for $k > 1$. If the environment is initially in a thermal state $\rho_{\tilde{N}}$ (cf. Equation (3.64))
this leads to

$$T[W(x)] = \exp\Bigl[-\frac{1}{2}\Bigl(\frac{|k^2 - 1|}{2} + N_c\Bigr)x^2\Bigr]W(kx), \tag{3.74}$$

where we have set $N_c = |k^2 - 1|\tilde{N}$. If we start initially with a thermal state $\rho_N$ it is
mapped by $T$ again to a thermal state $\rho_{N'}$ with mean photon number $N'$ given by

$$N' = k^2N + \max\{0, k^2 - 1\} + N_c. \tag{3.75}$$

If $N_c = 0$ this means that $T$ amplifies ($k > 1$) or damps ($k < 1$) the mean
photon number, while $N_c > 0$ leads to additional classical, Gaussian noise. We will
reconsider this channel in greater detail in a later chapter.
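Equation (3.75) is easy to evaluate directly. The following minimal sketch does so; the function name and the argument `N_env` (the mean photon number of the thermal state of the environment, which enters through $N_c = |k^2 - 1|$ times that number) are naming assumptions made for this illustration.

```python
def output_photon_number(k, N, N_env=0.0):
    """Mean photon number N' of the output thermal state, Equation (3.75).

    k     : real channel parameter (k < 1 attenuation, k > 1 amplification)
    N     : mean photon number of the thermal input state
    N_env : mean photon number of the thermal environment state
    """
    if k == 1:
        raise ValueError("the channels are defined for k != 1")
    N_c = abs(k * k - 1.0) * N_env
    return k * k * N + max(0.0, k * k - 1.0) + N_c
```

For example, `output_photon_number(0.5, 4.0)` describes pure damping of four photons to one, while `output_photon_number(2.0, 0.0)` shows that an amplifier adds $k^2 - 1$ photons even to the vacuum input.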



Chapter 4 
Basic tasks 



After we have discussed the conceptual foundations of quantum information we
will consider now some of its basic tasks. The spectrum ranges here from elementary
processes, like teleportation (Section 4.1) or error correction (Section 4.4), which are building blocks
for more complex applications, up to possible future technologies like quantum
cryptography (Section 4.6) and quantum computing (Section 4.5).

4.1 Teleportation and dense coding 

Maybe the most striking feature of entanglement is the fact that otherwise impossi- 
ble machines become possible if entangled states are used as an additional resource. 
The most prominent examples are teleportation and dense coding which we want 
to discuss in this section. 

4.1.1. Impossible machines revisited: Classical teleportation. — We have 
already pointed out in the introduction that classical teleportation, i.e. transmission 
of quantum information over a classical information channel is impossible. With 
the material introduced in the last two chapters it is now possible to reconsider this 
subject in a slightly more mathematical way, which makes the following treatment of 
entanglement enhanced teleportation more transparent. To "teleport" the state $\rho \in \mathcal{B}^*(\mathcal{H})$
Alice performs a measurement (described by a POV measure $E_1, \ldots, E_N \in \mathcal{B}(\mathcal{H})$)
on her system and gets a value $x \in X = \{1, \ldots, N\}$ with probability $p_x = \operatorname{tr}(E_x\rho)$.
These data she communicates to Bob and he prepares a system in
the state $\rho_x$. Hence the overall state Bob gets if the experiment is repeated many
times is: $\tilde{\rho} = \sum_{x \in X}\operatorname{tr}(E_x\rho)\rho_x$ (cf. Figure pTl] ). The latter can be rewritten as the
composition

$$\mathcal{B}^*(\mathcal{H}) \xrightarrow{\ E^*\ } \mathcal{C}^*(X) \xrightarrow{\ D^*\ } \mathcal{B}^*(\mathcal{H}) \tag{4.1}$$

of the channels

$$\mathcal{C}(X) \ni f \mapsto E(f) = \sum_{x \in X} f(x)E_x \in \mathcal{B}(\mathcal{H}) \tag{4.2}$$

and

$$\mathcal{C}^*(X) \ni p \mapsto D^*(p) = \sum_{x \in X} p_x\rho_x \in \mathcal{B}^*(\mathcal{H}), \tag{4.3}$$

i.e. $\tilde{\rho} = D^*E^*(\rho)$ and this Equation makes sense even if $X$ is not finite. The
teleportation is successful if the output state $\tilde{\rho}$ can not be distinguished from the input
state $\rho$ by any statistical experiment, i.e. if $D^*E^*(\rho) = \rho$. Hence the impossibility
of classical teleportation can be rephrased simply as $ED \ne \mathrm{Id}$ for all observables $E$
and all preparations $D$.

4.1.2. Entanglement enhanced teleportation. — Let us change our setup 
now slightly. Assume that Alice wants to send a quantum state $\rho \in \mathcal{B}^*(\mathcal{H})$ to
Bob and that she shares an entangled state $\sigma \in \mathcal{B}^*(\mathcal{K} \otimes \mathcal{K})$ and an ideal classical
communication channel $\mathcal{C}(X) \to \mathcal{C}(X)$ with him. Alice can perform a measurement
$E : \mathcal{C}(X) \to \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ on the composite system $\mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ consisting of the
particle to teleport ($\mathcal{B}(\mathcal{H})$) and her part of the entangled system ($\mathcal{B}(\mathcal{K})$). Then she
communicates the classical data $x \in X$ to Bob and he operates with the parameter
dependent operation $D : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K}) \otimes \mathcal{C}(X)$ appropriately on his particle (cf.
Figure 4.1). Hence the overall procedure can be described by the channel
$T = (E \otimes \mathrm{Id})D$, or in analogy to (4.1)

$$\mathcal{B}^*(\mathcal{H} \otimes \mathcal{K} \otimes \mathcal{K}) \xrightarrow{\ E^* \otimes \mathrm{Id}\ } \mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{K}) \xrightarrow{\ D^*\ } \mathcal{B}^*(\mathcal{H}). \tag{4.4}$$

The teleportation of $\rho$ is successful if

$$T^*(\rho \otimes \sigma) := D^*\bigl((E^* \otimes \mathrm{Id})(\rho \otimes \sigma)\bigr) = \rho \tag{4.5}$$

holds, in other words if there is no statistical measurement which can distinguish
the final state $T^*(\rho \otimes \sigma)$ of Bob's particle from the initial state $\rho$ of Alice's input
system. The two channels $E$ and $D$ and the entangled state $\sigma$ form a teleportation
scheme if Equation (4.5) holds for all states $\rho$ of the $\mathcal{B}(\mathcal{H})$ system, i.e. if each state
of a $\mathcal{B}(\mathcal{H})$ system can be teleported without loss of quantum information.

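Assume now that $\mathcal{H} = \mathcal{K} = \mathbb{C}^d$ and $X = \{0, \ldots, d^2 - 1\}$ holds. In this case
we can define a teleportation scheme as follows: The entangled state shared by
Alice and Bob is a maximally entangled state $\sigma = |\Omega\rangle\langle\Omega|$ and Alice performs a
measurement which is given by the one dimensional projections $E_j = |\Phi_j\rangle\langle\Phi_j|$,
where $\Phi_j \in \mathcal{H} \otimes \mathcal{H}$, $j = 0, \ldots, d^2 - 1$ is a basis of maximally entangled vectors.
If her result is $j = 0, \ldots, d^2 - 1$ Bob has to apply the operation $\tau \mapsto U_j\tau U_j^*$ on
his partner of the entangled pair, where the $U_j \in \mathcal{B}(\mathcal{H})$, $j = 0, \ldots, d^2 - 1$ are an
orthonormal family of unitary operators, i.e. $\operatorname{tr}(U_j^*U_k) = d\,\delta_{jk}$. Hence the parameter
dependent operation $D$ has the form (in the Schrödinger picture):

$$\mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{H}) \ni (p, \tau) \mapsto D^*(p, \tau) = \sum_{j=0}^{d^2-1} p_j\,U_j\tau U_j^* \in \mathcal{B}^*(\mathcal{H}). \tag{4.6}$$

In the Heisenberg picture this corresponds to the operations $A \mapsto U_j^*AU_j$.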



Figure 4.1: Entanglement enhanced teleportation

Therefore we get for $T^*(\rho \otimes \sigma)$ from Equation (4.5)

$$\operatorname{tr}[T^*(\rho \otimes \sigma)A] = \operatorname{tr}\bigl[(E \otimes \mathrm{Id})^*(\rho \otimes \sigma)D(A)\bigr] \tag{4.7}$$

$$= \operatorname{tr}\Bigl[\sum_{j=0}^{d^2-1}\operatorname{tr}_{12}\bigl[|\Phi_j\rangle\langle\Phi_j|(\rho \otimes \sigma)\bigr]\,U_j^*AU_j\Bigr] \tag{4.8}$$

$$= \sum_{j=0}^{d^2-1}\operatorname{tr}\bigl[(\rho \otimes \sigma)\,|\Phi_j\rangle\langle\Phi_j| \otimes (U_j^*AU_j)\bigr]; \tag{4.9}$$

here $\operatorname{tr}_{12}$ denotes the partial trace over the first two tensor factors (= Alice's qubits).
If $\Omega$, the $\Phi_j$ and the $U_j$ are related by the equation

$$\Phi_j = (U_j \otimes \mathbb{1})\Omega \tag{4.10}$$

it is a straightforward calculation to show that $T^*(\rho \otimes \sigma) = \rho$ holds as expected
]16^ . If $d = 2$ there is basically a unique choice: the $\Phi_j$, $j = 0, \ldots, 3$ are the four Bell
states (cf. Equation (3.3)), $\Omega = \Phi_0$ and the $U_j$ are the identity and the three Pauli
matrices. In this way we recover the standard example for teleportation, published 
for the first time in . The first experimental realizations are . 
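The qubit case can be checked by direct calculation, as in the following sketch. It is an illustration under stated assumptions: state vectors are plain Python lists, the Bell vectors are generated as $\Phi_j = (U_j \otimes \mathbb{1})\Omega$ with $\Omega = (|00\rangle + |11\rangle)/\sqrt{2}$, Bob corrects with $U_j$ after Alice has obtained the result j, and the function names are hypothetical.

```python
import math

SQ = 1 / math.sqrt(2)

# U_0..U_3: identity and the three Pauli matrices.
U = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Omega = (|00> + |11>)/sqrt(2); the amplitude of |a b> sits at index 2*a + b.
OMEGA = [SQ, 0.0, 0.0, SQ]


def bell(j):
    """Maximally entangled vector Phi_j = (U_j x 1) Omega."""
    return [sum(U[j][a][c] * OMEGA[2 * c + b] for c in range(2))
            for a in range(2) for b in range(2)]


def teleport(psi):
    """For each result j return (probability, fidelity of Bob's corrected state).

    Particle 1 carries psi, particles 2 and 3 carry Omega. Alice projects
    particles 1 and 2 onto Phi_j, Bob applies U_j to particle 3.
    """
    results = []
    for j in range(4):
        phi = bell(j)
        # Bob's unnormalized vector <Phi_j|_{12} (psi_1 x Omega_{23}).
        bob = [sum(phi[2 * a + b].conjugate() * psi[a] * OMEGA[2 * b + c]
                   for a in range(2) for b in range(2))
               for c in range(2)]
        prob = sum(abs(v) ** 2 for v in bob)
        norm = math.sqrt(prob)
        corrected = [sum(U[j][c][k] * bob[k] for k in range(2)) / norm
                     for c in range(2)]
        fidelity = abs(sum(p.conjugate() * q for p, q in zip(psi, corrected))) ** 2
        results.append((prob, fidelity))
    return results
```

For any normalized input such as `[0.6, 0.8j]` each of the four results occurs with probability 1/4 and Bob's corrected state has fidelity 1 with the input.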

4.1.3. Dense coding. — We have just shown how quantum information can be 
transmitted via a classical channel, if entanglement is available as an additional
resource. Now we are looking at the dual procedure: transmission of classical
information over a quantum channel. To send the classical information $x \in X = \{1, \ldots, n\}$
to Bob, Alice can prepare a $d$-level quantum system in the state $\rho_x \in \mathcal{B}^*(\mathcal{H})$, sends
it to Bob and he measures an observable given by positive operators $E_1, \ldots, E_m$.
The probability for Bob to receive the signal $y \in X$ if Alice has sent $x \in X$ is
$\operatorname{tr}(\rho_xE_y)$ and this defines a classical information channel by (cf. Subsection 3.2.3)

$$\mathcal{C}^*(X) \ni p \mapsto \Bigl(\sum_{x \in X} p(x)\operatorname{tr}(\rho_xE_1), \ldots, \sum_{x \in X} p(x)\operatorname{tr}(\rho_xE_m)\Bigr) \in \mathcal{C}^*(X). \tag{4.11}$$

To get an ideal channel we just have to choose mutually orthogonal pure states
$\rho_x = |\psi_x\rangle\langle\psi_x|$, $x = 1, \ldots, d$ on Alice's side and the corresponding one-dimensional
projections $E_y = |\psi_y\rangle\langle\psi_y|$, $y = 1, \ldots, d$ on Bob's. If $d = 2$ and $\mathcal{H} = \mathbb{C}^2$ it is
possible to send one bit classical information via one qubit quantum information.
The crucial point is now that the amount of classical information can be increased
(doubled in the qubit case) if Alice shares an entangled state $\sigma \in \mathcal{B}^*(\mathcal{H} \otimes \mathcal{H})$ with
Bob. To send the classical information $x \in X = \{1, \ldots, n\}$ to Bob, Alice operates
on her particle with an operation $D_x : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$, sends it through an (ideal)
quantum channel to Bob and he performs a measurement $E_1, \ldots, E_n \in \mathcal{B}(\mathcal{H} \otimes \mathcal{H})$
on both particles. The probability for Bob to measure $y \in X$ if Alice has sent $x \in X$
is given by

$$\operatorname{tr}\bigl[(D_x \otimes \mathrm{Id})^*(\sigma)E_y\bigr] \tag{4.12}$$

and this defines the transition matrix of a classical communication channel $T$. If $T$
is an ideal channel, i.e. if the transition matrix (4.12) is the identity, we will call $E$,
$D$ and $\sigma$ a dense coding scheme (cf. Figure 4.2).



In analogy to Equation (4.4) we can rewrite the channel $T$ defined by (4.12) in
terms of the composition

$$\mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{H}) \otimes \mathcal{B}^*(\mathcal{H}) \xrightarrow{\ D^* \otimes \mathrm{Id}\ } \mathcal{B}^*(\mathcal{H}) \otimes \mathcal{B}^*(\mathcal{H}) \xrightarrow{\ E^*\ } \mathcal{C}^*(X) \tag{4.13}$$

of the parameter dependent operation

$$D^* : \mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{H}) \to \mathcal{B}^*(\mathcal{H}), \quad p \otimes \tau \mapsto \sum_{j=1}^{n} p_jD_j^*(\tau) \tag{4.14}$$

and the observable

$$E : \mathcal{C}(X) \to \mathcal{B}(\mathcal{H} \otimes \mathcal{H}), \quad p \mapsto \sum_{j=1}^{n} p_jE_j, \tag{4.15}$$

i.e. $T^*(p) = E^* \circ (D^* \otimes \mathrm{Id})(p \otimes \sigma)$. The advantage of this point of view is that it
works as well for infinite dimensional Hilbert spaces and continuous observables.

Figure 4.2: Dense coding

Finally let us consider again the case where $\mathcal{H} = \mathbb{C}^d$ and $X = \{1, \ldots, d^2\}$. If
we choose as in the last paragraph a maximally entangled vector $\Omega \in \mathcal{H} \otimes \mathcal{H}$, an
orthonormal basis $\Phi_x \in \mathcal{H} \otimes \mathcal{H}$, $x = 1, \ldots, d^2$ of maximally entangled vectors and
an orthonormal family $U_x \in \mathcal{B}(\mathcal{H})$, $x = 1, \ldots, d^2$ of unitary operators, we can
construct a dense coding scheme as follows: $E_x = |\Phi_x\rangle\langle\Phi_x|$, $D_x(A) = U_x^* A U_x$ and
$\sigma = |\Omega\rangle\langle\Omega|$. If $\Omega$, the $\Phi_x$ and the $U_x$ are related by Equation (4.10) it is easy to see
that we really get a dense coding scheme [16S|. If $d = 2$ holds, we have to set again
the Bell basis for the $\Phi_x$, $\Omega = \Phi_0$ and the identity and the Pauli matrices for the
$U_x$. We recover in this case the standard example of dense coding proposed in jl^
and we see that we can transfer two bits via one qubit, as stated above.
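
The qubit case can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes $\Omega = \Phi_0 = (|00\rangle + |11\rangle)/\sqrt{2}$, takes the identity and the Pauli matrices for the $U_x$, and builds the Bell basis as $\Phi_x = (U_x \otimes \mathbb{1})\Omega$. The helper names (`apply_on_alice`, `transition_matrix`) are illustrative only. In the Schrödinger picture Alice's encoding acts as $\sigma \mapsto (U_x \otimes \mathbb{1})\sigma(U_x \otimes \mathbb{1})^*$, and the resulting transition matrix (4.12) should be the identity on four signals.

```python
import itertools
import math

# Identity and Pauli matrices as nested tuples; these play the role of U_x.
UNITARIES = [
    ((1, 0), (0, 1)),      # identity
    ((0, 1), (1, 0)),      # sigma_x
    ((0, -1j), (1j, 0)),   # sigma_y
    ((1, 0), (0, -1)),     # sigma_z
]

# Omega = Phi_0 = (|00> + |11>)/sqrt(2); components ordered 00, 01, 10, 11.
OMEGA = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]


def apply_on_alice(u, vec):
    """Return (u tensor 1) vec, with Alice holding the first qubit."""
    out = [0j] * 4
    for a, b, a_in in itertools.product(range(2), repeat=3):
        out[2 * a + b] += u[a][a_in] * vec[2 * a_in + b]
    return out


def inner(v, w):
    """Scalar product, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(v, w))


# Bell basis Phi_x = (U_x tensor 1) Omega; Bob measures E_x = |Phi_x><Phi_x|.
BELL = [apply_on_alice(u, OMEGA) for u in UNITARIES]


def transition_matrix():
    """T[x][y]: probability that Bob reads y after Alice encoded x."""
    return [
        [abs(inner(BELL[y], apply_on_alice(UNITARIES[x], OMEGA))) ** 2
         for y in range(4)]
        for x in range(4)
    ]
```

Calling `transition_matrix()` gives a 4 x 4 matrix; the four signals correspond to the two classical bits carried by the single transmitted qubit.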

4.2 Estimating and copying 

The impossibility of classical teleportation can be rephrased as follows: It is impos- 
sible to get complete information about the state p of a quantum system by one 
measurement on one system. However, if we have many systems, say N , all prepared 
in the same state p it should be possible to get (with a clever measuring strategy) 
as much information on p as possible, provided N is large enough. In this way we 
can circumvent the impossibility of devices like classical teleportation or quantum 
copying at least in an approximate way. 

4.2.1. Quantum state estimation. — To discuss this idea in a more detailed 
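way consider a number N of d-level quantum systems, all of them prepared in the
same (unknown) state $\rho \in \mathcal{B}^*(\mathcal{H})$. Our aim is to estimate the state $\rho$ by measurements
on the compound system $\rho^{\otimes N}$. This is described in terms of an observable
$E^N : \mathcal{C}(X_N) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ with values in a finite subset¹ $X_N \subset \mathcal{S}(\mathcal{H})$ of the quantum
state space $\mathcal{S}(\mathcal{H})$. According to Subsection 3.2.4 each such $E^N$ is given in terms of a
tuple $E^N_\sigma$, $\sigma \in X_N$, by $E^N(f) = \sum_\sigma f(\sigma) E^N_\sigma$; hence we get for the expectation value
of an $E^N$ measurement on systems in the state $\rho^{\otimes N}$ the density matrix $\hat\rho_N \in \mathcal{S}(\mathcal{H})$
with matrix elements

$$\langle \phi, \hat\rho_N \psi \rangle = \sum_{\sigma \in X_N} \langle \phi, \sigma \psi \rangle\, \mathrm{tr}(E^N_\sigma \rho^{\otimes N}). \qquad (4.16)$$

We will call the channel $E^N$ an estimator and the criterion for a good estimator
$E^N$ is that for any one-particle density operator $\rho$, the value measured on a state
$\rho^{\otimes N}$ is likely to be close to $\rho$, i.e. that the probability

$$K_N(\omega) := \mathrm{tr}(E^N(\omega) \rho^{\otimes N}) \quad \text{with} \quad E^N(\omega) = \sum_{\sigma \in X_N \cap \omega} E^N_\sigma \qquad (4.17)$$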



is small if $\omega \subset \mathcal{S}(\mathcal{H})$ is the complement of a small ball around $\rho$. Of course, we
will look at this problem for large N. So the task is to find a whole sequence of
observables $E^N$, $N = 1, 2, \ldots$, making error probabilities like (4.17) go to zero as
$N \to \infty$.

¹This is a severe restriction at this point and physically not very well motivated. There might
be more general (i.e. continuous) observables taking their values in the whole state space $\mathcal{S}(\mathcal{H})$
which lead to much better estimates. However, we do not discuss this possibility in order to keep
the mathematics more elementary.

The most direct way to get a family $E^N$, $N \in \mathbb{N}$ of estimators with this property
is to perform a sequence of measurements on each of the N input systems separately.
A finite set of observables which leads to a successful estimation strategy is
usually called a "quorum" (cf. e.g. [107, 162]). E.g. for d = 2 we can perform alternating
measurements of the three spin components. If $\rho = \frac{1}{2}(\mathbb{1} + \vec{x} \cdot \vec{\sigma})$ is the Bloch
representation of $\rho$ (cf. Subsection 2.1.2) we see that the expectation values of these
measurements are given by $\frac{1}{2}(1 + x_j)$. Hence we get an arbitrarily good estimate if
N is large enough. A similar procedure is possible for arbitrary d if we consider the
generalized Bloch representation for $\rho$ (see again Subsection 2.1.2). There are however
more efficient strategies based on "entangled" measurements (i.e. the $E^N_\sigma$
cannot be decomposed into pure tensor products) on the whole input system $\rho^{\otimes N}$
(e.g. [156, 99]). Somewhat in between are "adaptive schemes" ||6^ consisting of separate
measurements where the j-th measurement depends on the results of the (j - 1)-th.
We will reconsider this circle of questions in a more quantitative way in Chapter 
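
The quorum strategy for d = 2 can be illustrated by a small simulation. This is only a sketch under stated assumptions: each copy is measured along one of the three spin directions in turn, the "up" outcome along direction j occurs with probability $\frac{1}{2}(1 + x_j)$ as stated above, and the function name `estimate_bloch_vector` is hypothetical.

```python
import random


def estimate_bloch_vector(x, n_systems, rng):
    """Estimate the Bloch vector x from n_systems identically prepared qubits.

    Copy number k is measured along spin direction k mod 3; the relative
    frequency of "up" results along direction j estimates (1 + x_j)/2.
    """
    if n_systems < 3:
        raise ValueError("need at least one copy per spin direction")
    counts = [0, 0, 0]
    ups = [0, 0, 0]
    for k in range(n_systems):
        j = k % 3
        counts[j] += 1
        if rng.random() < 0.5 * (1 + x[j]):
            ups[j] += 1
    # Invert p_j = (1 + x_j)/2 to recover the estimate of x_j.
    return [2 * ups[j] / counts[j] - 1 for j in range(3)]
```

The statistical error of each component decreases like $1/\sqrt{N}$, which is the sense in which the estimate becomes arbitrarily good for large N.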



4.2.2. Approximate cloning. — By virtue of the no-cloning theorem [|173| , it 
is impossible to produce M perfect copies of a d-level quantum system if $N < M$
input systems in the common (unknown) state $\rho^{\otimes N}$ are given. More precisely there
is no channel $T_{MN} : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ such that $T^*_{MN}(\rho^{\otimes N}) = \rho^{\otimes M}$ holds for
all $\rho \in \mathcal{S}(\mathcal{H})$. Using state estimation, however, it is easy to find a device $T_{MN}$ which
produces at least approximate copies which become exact in the limit $N, M \to \infty$:
If $\rho^{\otimes N}$ is given, we measure the observable $E^N$ and get the classical data $\sigma \in X_N \subset
\mathcal{S}(\mathcal{H})$, which we use subsequently to prepare M systems in the state $\sigma^{\otimes M}$. In other
words, $T_{MN}$ has the form

$$\mathcal{B}^*(\mathcal{H}^{\otimes N}) \ni \tau \mapsto \sum_{\sigma \in X_N} \mathrm{tr}(E^N_\sigma \tau)\, \sigma^{\otimes M} \in \mathcal{B}^*(\mathcal{H}^{\otimes M}). \qquad (4.18)$$

We see immediately that the probability to get wrong copies coincides exactly with
the error probability of the estimator given in Equation (4.17). This shows first
that we get exact copies in the limit $N \to \infty$ and second that the quality of the
copies does not depend on the number M of output systems, i.e. the asymptotic
rate $\lim_{N,M \to \infty} M/N$ of output systems per input system can be arbitrarily large.

The fact that we get classical data at an intermediate step allows a further
generalization of this scheme. Instead of just preparing M systems in the state
$\sigma$ detected by the estimator, we can apply first an arbitrary transformation $F :
\mathcal{S}(\mathcal{H}) \to \mathcal{S}(\mathcal{H})$ on the density matrix $\sigma$ and prepare $F(\sigma)^{\otimes M}$ instead of $\sigma^{\otimes M}$. In this
way we get the channel (cf. Figure 4.3)

$$\mathcal{B}^*(\mathcal{H}^{\otimes N}) \ni \tau \mapsto \sum_{\sigma \in X_N} \mathrm{tr}(E^N_\sigma \tau)\, F(\sigma)^{\otimes M} \in \mathcal{B}^*(\mathcal{H}^{\otimes M}), \qquad (4.19)$$

i.e. a physically realizable device which approximates the impossible machine F. The
probability to get a bad approximation of the state $F(\rho)^{\otimes M}$ (if the input state was
$\rho^{\otimes N}$) is again given by the error probability of the estimator and we get a perfect
realization of F at arbitrary rate as $M, N \to \infty$.

Figure 4.3: Approximating the impossible machine F by state estimation. The estimator yields the classical data $\sigma \in X_N \subset \mathcal{S}$, from which the states $F(\sigma) \in \mathcal{S}$ are prepared.

There are in particular two interesting tasks which become possible this way:
The first is the "universal not gate" which associates to each pure state of a qubit the
unique pure state orthogonal to it p6[ | . This is a special example of an antiunitarily
implemented symmetry operation and therefore not completely positive. The second
example is the purification of states [100]. Here it is assumed that the input
states were once pure but have passed later on a depolarizing channel $|\phi\rangle\langle\phi| \mapsto
\vartheta |\phi\rangle\langle\phi| + (1 - \vartheta)\mathbb{1}/d$. If $\vartheta > 0$ this map is invertible but its inverse does not describe
an allowed quantum operation because it maps some density operators to operators
with negative eigenvalues. Hence the reversal of noise is not possible with a one shot
operation but can be done with high accuracy if enough input systems are available.
We rediscuss this topic in Chapter ^ 
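
Why the inverse of the depolarizing channel is not a valid operation can be seen in the qubit case. The sketch below is an illustration under assumptions not spelled out in the text: d = 2, the channel is extended linearly to mixed states, and states are represented by their Bloch vectors, so that the channel multiplies the Bloch vector by $\vartheta$ and its formal inverse divides by $\vartheta$. The eigenvalues of a qubit operator with Bloch vector $\vec{x}$ are $\frac{1}{2}(1 \pm |\vec{x}|)$.

```python
import math


def eigenvalues(bloch):
    """Eigenvalues (1 -/+ |x|)/2 of the operator (1 + x.sigma)/2."""
    r = math.sqrt(sum(c * c for c in bloch))
    return ((1 - r) / 2, (1 + r) / 2)


def depolarize(bloch, theta):
    """Depolarizing channel: the Bloch vector shrinks by the factor theta."""
    return tuple(theta * c for c in bloch)


def inverse_depolarize(bloch, theta):
    """Formal inverse of the channel; linear, but not positive."""
    return tuple(c / theta for c in bloch)
```

For example, with $\vartheta = 0.5$ the inverse restores a depolarized pure state, but applied to a state with Bloch vector of length 0.8 it produces a "Bloch vector" of length 1.6, i.e. an operator with a negative eigenvalue.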

4.3 Distillation of entanglement 

Let us return now to entanglement. We have seen in Section 4.1 that maximally
entangled states play a crucial role for processes like teleportation and dense coding.
In practice, however, entanglement is a rather fragile property: If Alice produces a
pair of particles in a maximally entangled state $|\Omega\rangle\langle\Omega| \in \mathcal{S}(\mathcal{H}_A \otimes \mathcal{H}_B)$ and distributes
one of them over a great distance to Bob, both end up with a mixed state $\rho$ which
contains much less entanglement than the original and which cannot be used any
longer for teleportation. The latter can be seen quite easily if we try to apply
the qubit teleportation scheme (Subsection 4.1.2) with a non-maximally entangled
isotropic state (Equation (3.15) with $\lambda > 0$) instead of $\Omega$.

Hence the question arises, whether it is possible to recover $|\Omega\rangle\langle\Omega|$ from $\rho$, or,
following the reasoning from the last section, at least a small number of (almost)
maximally entangled states from a large number N of copies of $\rho$. However since
the distance between Alice and Bob is big (and quantum communication therefore
impossible) only LOCC operations (Section 3.2.6) are available for this task (Alice
and Bob can only operate on their respective particles, drop some of them and
communicate classically with one another). This excludes procedures like the purification
scheme just sketched, because we would need "entangled" measurements
to get an asymptotically exact estimate for the state $\rho$. Hence we need a sequence
of LOCC channels

$$T_N : \mathcal{B}(\mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}) \to \mathcal{B}(\mathcal{H}_A^{\otimes N} \otimes \mathcal{H}_B^{\otimes N}) \qquad (4.20)$$

such that

$$\| T_N^*(\rho^{\otimes N}) - |\Omega_N\rangle\langle\Omega_N| \|_1 \to 0, \quad \text{for } N \to \infty \qquad (4.21)$$

holds, with a sequence of maximally entangled vectors $\Omega_N \in \mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}$. Note that
we have to use here the natural isomorphism $\mathcal{H}_A^{\otimes N} \otimes \mathcal{H}_B^{\otimes N} \cong (\mathcal{H}_A \otimes \mathcal{H}_B)^{\otimes N}$, i.e. we
have to reshuffle $\rho^{\otimes N}$ such that the first N tensor factors belong to Alice ($\mathcal{H}_A$) and
the last N to Bob ($\mathcal{H}_B$). If confusion can be avoided we will use this isomorphism
in the following without a further note. We will call a sequence of LOCC channels
$T_N$ satisfying (4.21) with a state $\rho \in \mathcal{S}(\mathcal{H}_A \otimes \mathcal{H}_B)$ a distillation scheme for $\rho$ and
$\rho$ is called distillable if it admits a distillation scheme. The asymptotic rate with
which maximally entangled states can be distilled with a given protocol is

$$\liminf_{N \to \infty} \log_2(d_N)/N. \qquad (4.22)$$

This quantity will become relevant in the framework of entanglement measures 
(Chapter I). 

4.3.1. Distillation of pairs of qubits. — Concrete distillation protocols are in 
general rather complicated procedures. We will sketch in the following how any pair
of entangled qubits can be distilled. The first step is a scheme proposed for the first
time by Bennett et al. It can be applied if the maximally entangled fraction $\mathcal{F}$
(Equation (3.4)) is greater than 1/2. As indicated above, we assume that Alice and
Bob share a large number of pairs in the state $\rho$, so that the total state is $\rho^{\otimes N}$. To
obtain a smaller number of pairs with a higher $\mathcal{F}$ they proceed as follows:

1. First they take two pairs (let us call them pair 1 and pair 2), i.e. $\rho \otimes \rho$, and
apply to each of them the twirl operation $P_{U\bar{U}}$ associated to isotropic states
(cf. Equation (p^). This can be done by LOCC operations in the following
way: Alice selects at random (respecting the Haar measure on U(2)) a unitary
operator U, applies it to her qubits and tells Bob which transformation she
has chosen; then he applies $\bar{U}$ to his particles. They end up with two isotropic
states $\tilde\rho \otimes \tilde\rho$ with the same maximally entangled fraction as $\rho$.

2. Each party performs the unitary transformation

$$U_{\mathrm{XOR}} : |a\rangle \otimes |b\rangle \mapsto |a\rangle \otimes |a + b \bmod 2\rangle \qquad (4.23)$$

on his/her members of the pairs.

3. Finally Alice and Bob each perform locally a measurement in the basis $|0\rangle$, $|1\rangle$ on
pair 1 and discard it afterwards. If the measurements agree, pair 2 is kept
and has a higher $\mathcal{F}$. Otherwise pair 2 is discarded as well.

If this procedure is repeated over and over again, it is possible to get states
with an arbitrarily high $\mathcal{F}$, but we have to sacrifice more and more pairs and the
asymptotic rate is zero. To overcome this problem we can apply the scheme above
until $\mathcal{F}(\rho)$ is high enough such that $1 + \mathrm{tr}(\rho \ln \rho) > 0$ holds and then we continue
with another scheme called hashing which leads to a nonvanishing rate.
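
The effect of repeating the three steps can be made quantitative for isotropic two-qubit states. The sketch below relies on the standard recurrence formula from the literature on the Bennett et al. protocol, which is quoted here as an assumption and not derived in the text: with $g = (1 - \mathcal{F})/3$, one round succeeds with probability $\mathcal{F}^2 + 2\mathcal{F}g + 5g^2$ and yields the new fraction $(\mathcal{F}^2 + g^2)/(\mathcal{F}^2 + 2\mathcal{F}g + 5g^2)$. The function names are illustrative.

```python
def recurrence_step(f):
    """One round on two isotropic qubit pairs with fraction f.

    Returns (new fraction, success probability). The formula is the
    well-known recurrence map, assumed here rather than derived.
    """
    g = (1 - f) / 3
    p_success = f * f + 2 * f * g + 5 * g * g
    return (f * f + g * g) / p_success, p_success


def distill(f, target):
    """Iterate until the fraction reaches target.

    Returns (final fraction, rounds, expected surviving pairs per input pair).
    """
    if not 0.5 < f <= 1 or not 0.5 < target < 1:
        raise ValueError("need 1/2 < f <= 1 and 1/2 < target < 1")
    rounds, pairs_left = 0, 1.0
    while f < target:
        f, p = recurrence_step(f)
        pairs_left *= p / 2  # two pairs go in, at most one comes out
        rounds += 1
    return f, rounds, pairs_left
```

The value 1/2 is a fixed point of the map, fractions above it increase, and the surviving share of pairs shrinks by at least a factor of two per round, which is why the asymptotic rate of this step alone vanishes.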

If finally $\mathcal{F}(\rho) < 1/2$ but $\rho$ is entangled, Alice and Bob can increase $\mathcal{F}$ for some
of their particles by filtering operations. The basic idea is that Alice applies
an instrument $T : \mathcal{C}(X) \otimes \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ with two possible outcomes ($X = \{1, 2\}$)
to her particles. Hence the state becomes $\rho \mapsto p_x^{-1}(T_x \otimes \mathrm{Id})^*(\rho)$, $x = 1, 2$ with
probability $p_x = \mathrm{tr}[T_x^*(\rho)]$ (cf. Subsection 3.2.5, in particular Equation (3.50), for
the definition of $T_x$). Alice communicates her measuring result x to Bob and if
$x = 1$ they keep the particle, otherwise ($x = 2$) they discard it. If the instrument T
was correctly chosen Alice and Bob end up with a state $\tilde\rho$ with higher maximally
entangled fraction. To find an appropriate T note first that there are $\psi \in \mathcal{H} \otimes \mathcal{H}$
with $\langle \psi, (\mathrm{Id} \otimes \Theta)\rho\, \psi \rangle < 0$ (this follows from Theorem 2.4.3 since $\rho$ is by assumption
entangled) and second that we can write each vector $\psi \in \mathcal{H} \otimes \mathcal{H}$ as $(X_\psi \otimes \mathbb{1})\Phi_0$ with
the Bell state $\Phi_0$ and an appropriately chosen operator $X_\psi$ (see Subsection 3.1.1).
Now we can define T in terms of the two operations $T_1, T_2$ (cf. Equation (3.52))
with

$$T_1(A) = X_\psi^* A X_\psi, \quad \mathrm{Id} - T_1 = T_2. \qquad (4.24)$$

It is straightforward to check that we end up with

$$\tilde\rho = \frac{(T_1 \otimes \mathrm{Id})^*(\rho)}{\mathrm{tr}[(T_1 \otimes \mathrm{Id})^*(\rho)]} \qquad (4.25)$$

such that $\mathcal{F}(\tilde\rho) > 1/2$ holds and we can continue with the scheme described in the
previous paragraph.

4.3.2. Distillation of isotropic states. — Consider now an entangled isotropic 
state p in d dimensions, i.e. we have ^ — and < tr(Fp) < 1 (with the operator 



F of Subsection 3.1.3). Each such state is distillable via the following scheme |2j, p5[ :
First Alice and Bob apply a filter operation $T : \mathcal{C}(X) \otimes \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ on their
respective particle given by $T_1(A) = PAP$, $T_2 = \mathrm{Id} - T_1$ where P is the projection
onto a two dimensional subspace. If both measure the value 1 they get a qubit pair
in the state $\tilde\rho = (T_1 \otimes T_1)(\rho)$. Otherwise they discard their particles (this requires
classical communication). The state $\tilde\rho$ is entangled (this is easily checked),
hence they can proceed as in the previous subsection.

The scheme just proposed can be used to show that each state $\rho$ which violates
the reduction criterion (cf. Subsection 2.4.3) can be distilled |8q|. The basic idea is to
project $\rho$ with the twirl $P_{U\bar{U}}$ (which is LOCC as we have seen above; cf. Subsection
4.3.1) to an isotropic state $P_{U\bar{U}}(\rho)$ and to apply afterwards the procedure from the
last paragraph. We only have to guarantee that $P_{U\bar{U}}(\rho)$ is entangled. To this end
use a vector $\psi \in \mathcal{H} \otimes \mathcal{H}$ with $\langle \psi, (\mathbb{1} \otimes \mathrm{tr}_1(\rho) - \rho)\psi \rangle < 0$ (which exists by assumption
since $\rho$ violates the reduction criterion) and apply the filter operation given by
$\psi$ via Equation (4.24).



4.3.3. Bound entangled states. — It is obvious that separable states are not 
distillable, because a LOCC operation maps separable states to separable states.
However, is each entangled state distillable? The answer, maybe somewhat surprising,
is no, and an entangled state which is not distillable is called bound entangled
p7| (distillable states are sometimes called free entangled, in analogy to thermodynamics).
Examples of bound entangled states are all ppt entangled states \ ^7\ :
This is an easy consequence of the fact that each separable channel (and therefore
each LOCC channel as well) maps ppt states to ppt states (this is easy to check),
but a maximally entangled state is never ppt. It is not yet known whether bound
entangled npt states exist; however, there are at least some partial results: 1. It is
sufficient to solve this question for Werner states, i.e. if we can show that each npt
Werner state is distillable it follows that all npt states are distillable. 2. Each
npt Gaussian state is distillable |Q. 3. For each $N \in \mathbb{N}$ there is an npt Werner
state $\rho$ which is not "N-copy distillable", i.e. $\langle \psi, (\mathrm{Id} \otimes \Theta)(\rho^{\otimes N}) \psi \rangle \geq 0$ holds for each pure
state $\psi$ with exactly two Schmidt summands |5^ . This gives some evidence for
the existence of bound entangled npt states because $\rho$ is distillable iff it is N-copy
distillable for some N.

Since bound entangled states cannot be distilled, they cannot be used for
teleportation. Nevertheless bound entanglement can produce a non-classical effect,
called "activation of bound entanglement". To explain the basic idea, assume
that Alice and Bob share one pair of particles in a distillable state $\rho_f$ and many
particles in a bound entangled state $\rho_b$. Assume in addition that $\rho_f$ cannot be
used for teleportation, or, in other words, if $\rho_f$ is used for teleportation the particle
Bob receives is in a state $\sigma'$ which differs from the state $\sigma$ Alice has sent. This
problem cannot be solved by distillation, since Alice and Bob share only one pair
of particles in the state $\rho_f$. Nevertheless they can try to apply an appropriate filter
operation on $\rho_f$ to get with a certain probability a new state which leads to a better
quality of the teleportation (or, if the filtering fails, to get nothing at all). It can
be shown however | |8^ that there are states $\rho_f$ such that the error occurring in this
process (e.g. measured by the trace norm distance of $\sigma$ and $\sigma'$) is always above
a certain threshold. This is the point where the bound entangled states $\rho_b$ come
into play: If Alice and Bob operate with an appropriate protocol on $\rho_f$ and many
copies of $\rho_b$ the distance between $\sigma$ and $\sigma'$ can be made arbitrarily small (although
the probability to be successful goes to zero). Another example for an activation
of bound entanglement is related to distillability of npt states: If Alice and Bob
share a certain ppt-entangled state as additional resource each npt state $\rho$ becomes
distillable (even if $\rho$ is bound entangled) [104]. For a more detailed survey of
the role of bound entanglement and further references see ]9lll . 



4.4 Quantum error correction 

If we try to distribute quantum information over large distances or store it for a 
long time in some sort of "quantum memory" we always have to deal with "de- 
coherence effects", i.e. unavoidable interactions with the environment. This results 
in a significant information loss, which is particularly bad for the functioning of a 
quantum computer. Similar problems arise as well in a classical computer, but the 
methods used there to circumvent the problems cannot be transferred to the quantum
regime. E.g. the simplest strategy to protect classical information against
noise is redundancy: instead of storing the information once we make three copies
and decide during readout by a majority vote which bit to take. It is easy to see that
this reduces the probability of an error from order $\epsilon$ to $\epsilon^2$. Quantum mechanically,
however, such a procedure is forbidden by the no-cloning theorem.
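
The classical statement about majority voting can be verified directly. The following sketch assumes that each of the three copies is flipped independently with probability $\epsilon$; the vote fails exactly when two or three copies are flipped, which gives $3\epsilon^2(1 - \epsilon) + \epsilon^3 = 3\epsilon^2 - 2\epsilon^3$. The function names are illustrative.

```python
from itertools import product


def majority(bits):
    """Majority vote over three bits."""
    return 1 if sum(bits) >= 2 else 0


def logical_error_probability(eps):
    """Exact failure probability of the three-copy majority vote.

    Each copy is assumed to be flipped independently with probability eps.
    """
    total = 0.0
    for flips in product((0, 1), repeat=3):
        prob = 1.0
        for flip in flips:
            prob *= eps if flip else 1 - eps
        if majority(flips) == 1:  # two or more flips: the vote is wrong
            total += prob
    return total
```

For small $\epsilon$ the result is dominated by the $3\epsilon^2$ term, in line with the reduction from order $\epsilon$ to order $\epsilon^2$ stated above.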

Nevertheless quantum error correction is possible although we have to do it in a
more subtle way than just copying; this was observed for the first time independently
in and [146]. Let us consider first the general scheme and assume that $T :
\mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{K})$ is a noisy quantum channel. To send quantum systems of type $\mathcal{B}(\mathcal{H})$
undisturbed through T we need an encoding channel $E : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{H})$ and a
decoding channel $D : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ such that $ETD = \mathrm{Id}$ holds, respectively
$D^*T^*E^* = \mathrm{Id}$ in the Schrödinger picture; cf. Figure 4.4.

A powerful error correction scheme should not be restricted to one particular
type of error, i.e. one particular noisy channel T. Assume instead that $\mathfrak{E} \subset \mathcal{B}(\mathcal{K})$ is
a linear subspace of "error operators" and T is any channel given by

$$T^*(\rho) = \sum_j F_j \rho F_j^*, \quad F_j \in \mathfrak{E}. \qquad (4.26)$$

An isometry $V : \mathcal{H} \to \mathcal{K}$ is called an error correcting code for $\mathfrak{E}$ if for each T of the
form (4.26) there is a decoding channel $D : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ with $D^*(T^*(V\rho V^*)) = \rho$
for all $\rho \in \mathcal{S}(\mathcal{H})$. By the theory of Knill and Laflamme [103] this is equivalent to
the factorization condition

$$\langle V\psi, F_j^* F_k V\phi \rangle = \omega(F_j^* F_k) \langle \psi, \phi \rangle \qquad (4.27)$$

where $\omega(F_j^* F_k)$ is a factor which does not depend on the arbitrary vectors $\psi, \phi \in \mathcal{H}$.

Figure 4.4: Five bit quantum code: Encoding one qubit into five and correcting one
error.

The most relevant examples of error correcting codes are those which generalize
the classical idea of sending multiple copies in a certain sense. This means we
encode a small number N of d-level systems into a big number $M > N$ of systems
of the same type, which are then transmitted and decoded back into N systems
afterwards. During the transmission $K < M$ arbitrary errors are allowed. Hence we
have $\mathcal{H} = \mathcal{H}_1^{\otimes N}$, $\mathcal{K} = \mathcal{H}_1^{\otimes M}$ with $\mathcal{H}_1 = \mathbb{C}^d$ and T is an arbitrary tensor product of
K noisy channels $S_j$, $j = 1, \ldots, K$ and $M - K$ ideal channels Id. The most well
known code for this type of error is the "five-bit code" where one qubit is encoded
into five and one error is corrected jl^ (cf. Figure 4.4 for $N = 1$, $M = 5$ and $K = 1$).
To define the corresponding error space $\mathfrak{E}$ consider the finite sets $X = \{1, \ldots, N\}$
and $Y = \{N + 1, \ldots, N + M\}$ and define first for each subset $Z \subset Y$:

$$\mathfrak{E}(Z) = \mathrm{span}\{ A_1 \otimes \cdots \otimes A_M \in \mathcal{B}(\mathcal{K}) \mid A_j \in \mathcal{B}(\mathcal{H}_1) \text{ arbitrary for } j + N \in Z,\ A_j = \mathbb{1} \text{ otherwise} \}. \qquad (4.28)$$

$\mathfrak{E}$ is now the span of all $\mathfrak{E}(Z)$ with $|Z| \leq K$ (i.e. the number of elements of Z is less than
or equal to K). We say that an error correcting code for this particular $\mathfrak{E}$ corrects K errors.

There are several ways to construct error correcting codes (see e.g. Q). Most of
these methods are somewhat involved however and require knowledge from classical
error correction which we want to skip. Therefore we will only present the scheme
proposed in [137], which is quite easy to describe and admits a simple way to check
the error correction condition. Let us sketch first the general scheme. We start with
an undirected graph $\Gamma$ with two kinds of vertices: a set of input vertices, labeled by X,
and a set of output vertices, labeled by Y. The links of the graph are given by the
adjacency matrix, i.e. an $(N + M) \times (N + M)$ matrix $\Gamma$ with $\Gamma_{jk} = 1$ if nodes k and j
are linked and $\Gamma_{jk} = 0$ otherwise. With respect to $\Gamma$ we can define now an isometry
$V_\Gamma : \mathcal{H}_1^{\otimes N} \to \mathcal{H}_1^{\otimes M}$ by

$$\langle j_{N+1} \ldots j_{N+M} | V_\Gamma | j_1 \ldots j_N \rangle = \exp \qquad (4.29)$$

Figure 4.5: Two graphs belonging to (equivalent) five bit codes. The input node can
be chosen in both cases arbitrarily.



with $j = (j_1, \ldots, j_{N+M}) \in \mathbb{Z}_d^{N+M}$ (where $\mathbb{Z}_d$ denotes the cyclic group with d
elements). There is an easy condition under which $V_\Gamma$ is an error correcting code.
To write it down we need the following additional terminology: We say that an error
correcting code $V : \mathcal{H}_1^{\otimes N} \to \mathcal{H}_1^{\otimes M}$ detects the error configuration $Z \subset Y$ if

$$\langle V\psi, F V\phi \rangle = \omega(F) \langle \psi, \phi \rangle \quad \forall F \in \mathfrak{E}(Z) \qquad (4.30)$$

holds. With Equation (4.27) it is easy to see that V corrects K errors iff it detects
all error configurations of length 2K or less. Now we have the following theorem:

Theorem 4.4.1 The quantum code $V_\Gamma$ defined in Equation (4.29) detects the error
configuration $Z \subset Y$ if the system of equations

$$\sum_{l \in X \cup Z} \Gamma_{kl}\, g_l = 0, \quad k \in Y \setminus Z, \quad g_l \in \mathbb{Z}_d \qquad (4.31)$$

implies that

$$g_l = 0, \ l \in X \quad \text{and} \quad \sum_{l \in Z} \Gamma_{kl}\, g_l = 0, \ k \in X \qquad (4.32)$$

holds.



We omit the proof, see [137] instead. Two particular examples (which are equivalent!)
are given in Figure 4.5. In both cases we have $N = 1$, $M = 5$ and $K = 1$,
i.e. one input node, which can be chosen arbitrarily, five output nodes and the
corresponding codes correct one error. For a more detailed survey on quantum error
correction, in particular for more examples we refer to ||20[| . 
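
The condition of Theorem 4.4.1 is a finite linear problem over $\mathbb{Z}_d$ and can be checked by brute force for small graphs. The sketch below does this for d = 2. Since Figure 4.5 is not reproduced here, the graph used is an assumption: a "wheel" with the input node 0 as hub, joined to a pentagon of output nodes 1 to 5. Nodes are numbered from 0, unlike in the text, and all function names are illustrative.

```python
from itertools import combinations, product


def satisfies_detection_condition(gamma, inputs, outputs, z, d=2):
    """Brute-force check that (4.31) implies (4.32) for the configuration z."""
    z = list(z)
    unknowns = list(inputs) + z
    rest = [k for k in outputs if k not in z]
    for values in product(range(d), repeat=len(unknowns)):
        g = dict(zip(unknowns, values))
        # Premise (4.31): sum over l in X u Z of Gamma_kl g_l = 0 for k in Y \ Z.
        if any(sum(gamma[k][l] * g[l] for l in unknowns) % d for k in rest):
            continue
        # Conclusion (4.32), first part: g_l = 0 for all l in X.
        if any(g[l] for l in inputs):
            return False
        # Conclusion (4.32), second part: sum over l in Z vanishes for k in X.
        if any(sum(gamma[k][l] * g[l] for l in z) % d for k in inputs):
            return False
    return True


def corrects(gamma, inputs, outputs, k_errors, d=2):
    """K errors are corrected if all configurations up to size 2K pass."""
    return all(
        satisfies_detection_condition(gamma, inputs, outputs, z, d)
        for size in range(1, 2 * k_errors + 1)
        for z in combinations(outputs, size)
    )


def wheel_graph():
    """Hub 0 joined to a pentagon 1..5; an assumed stand-in for Figure 4.5."""
    gamma = [[0] * 6 for _ in range(6)]
    for k in range(1, 6):
        nxt = k % 5 + 1
        for a, b in ((0, k), (k, nxt)):
            gamma[a][b] = gamma[b][a] = 1
    return gamma


def star_graph():
    """Hub 0 joined to outputs 1..5 without further links, for comparison."""
    gamma = [[0] * 6 for _ in range(6)]
    for k in range(1, 6):
        gamma[0][k] = gamma[k][0] = 1
    return gamma
```

With these definitions `corrects(wheel_graph(), [0], [1, 2, 3, 4, 5], 1)` evaluates the criterion for one-error correction, while the star graph fails already for single-site configurations because the premise (4.31) leaves $g_l$ on the error site undetermined.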

4.5 Quantum computing 

Quantum computing is without a doubt the most prominent and most far reaching 
application of quantum information theory, since it promises on the one hand, "ex- 
ponential speedup" for some problems which are "hard to solve" with a classical 
computer, and gives completely new insights into classical computing and complex- 
ity theory on the other. Unfortunately, an exhaustive discussion would require its 
own review article. Hence we are only able to give a short overview (see Part II
of [122| for a more complete presentation and for further references). 

4.5.1. The network model of classical computing. — Let us start with a 
brief (and very informal) introduction to classical computing (for a more complete 
review and hints for further reading see Chapter 3 of |122|). What we need first is 
a mathematical model for computation. There are in fact several different choices 
and the Turing machine |152| is the most prominent one. More appropriate for our 
purposes is, however, the so called network model, since it allows an easier generalization
to the quantum case. The basic idea is to interpret a classical (deterministic)
computation as the evaluation of a map $f : \mathbb{B}^N \to \mathbb{B}^M$ (where $\mathbb{B} = \{0, 1\}$ denotes
the field with two elements) which maps N input bits to M output bits. If $M = 1$
holds f is called a boolean function and it is for many purposes sufficient to consider
this special case - each general f is in fact a Cartesian product of boolean functions.
Particular examples are the three elementary gates AND, OR and NOT defined in
Figure 4.6 and arbitrary algebraic expressions constructed from them: e.g. the XOR
gate $(x, y) \mapsto x + y \bmod 2$ which can be written as $(x \vee y) \wedge \neg(x \wedge y)$. It is now a
standard result of boolean algebra that each boolean function can be represented
in this way and there are in general many possibilities to do this. A special case
is the disjunctive normal form of f; cf. [161]. To write such an expression down in
form of equations is, however, somewhat confusing. f is therefore expressed most
conveniently in graphical form as a circuit or network, i.e. a graph C with nodes
representing elementary gates and edges ("wires") which determine how the gates
should be composed; cf. Figure 4.7 for an example. A classical computation can now
be defined as a circuit applied to a specified string of input bits.

Figure 4.6: Symbols and definition for the three elementary gates AND, OR and
NOT.

Variants of this model arise if we replace AND, OR and NOT by another (finite)
set G of elementary gates. We only have to guarantee that each function f can be
expressed as a composition of elements from G. A typical example for G is the set
which contains only the NAND gate $(x, y) \mapsto x \uparrow y = \neg(x \wedge y)$. Since AND, OR
and NOT can be rewritten in terms of NAND (e.g. $\neg x = x \uparrow x$) we can calculate
each boolean function by a circuit of NAND gates.
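
The universality of NAND can be made concrete with a few lines of code. The sketch below builds NOT, AND and OR from NAND alone, then XOR via the expression $(x \vee y) \wedge \neg(x \wedge y)$ quoted above, and finally a half-adder whose two outputs are the sum bit $x + y \bmod 2$ and the carry bit. The function names are illustrative and bits are represented as the integers 0 and 1.

```python
def nand(x, y):
    """The only primitive gate: x NAND y = NOT (x AND y)."""
    return 1 - (x & y)


def not_(x):
    return nand(x, x)


def and_(x, y):
    return not_(nand(x, y))


def or_(x, y):
    # De Morgan: x OR y = NOT (NOT x AND NOT y).
    return nand(not_(x), not_(y))


def xor(x, y):
    # (x OR y) AND NOT (x AND y), i.e. x + y mod 2.
    return and_(or_(x, y), not_(and_(x, y)))


def half_adder(x, y):
    """Return (sum bit, carry bit) for two input bits."""
    return xor(x, y), and_(x, y)
```

Every function here is ultimately a composition of `nand` calls, which is the circuit picture described in the text with NAND as the single elementary gate.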




Figure 4.7: Half-adder circuit as an example for a boolean network (one output is labeled $x + y \bmod 2$).



4.5.2. Computational complexity. — One of the most relevant questions 
within classical computing, and the central subject of computational complexity, 
is whether a given problem is easy to solve or not, where "easy" is defined in terms 
of the scaling behavior of the resources needed in dependence of the size of the input 
data. We will give in the following a rough survey over the most basic aspects of 
this field, while we refer the reader to |124| for a detailed presentation. 

To start with, let us specify the basic question in greater detail. First of all the
problems we want to analyze are decision problems which only give the two possible
values "yes" and "no". They are mathematically described by boolean functions
acting on bit strings of arbitrary size. A well known example is the factoring problem
given by the function fac with fac(m, l) = 1 if m (more precisely the natural number
represented by m) has a nontrivial divisor less than l and fac(m, l) = 0 otherwise. Note that
many tasks of classical computation can be reformulated this way, so that we do
not get a severe loss of generality. The second crucial point we have to clarify is the
question what exactly are the resources we have mentioned above and how we have
to quantify them. A natural physical quantity which comes to mind immediately
is the time needed to perform the computation (space is another candidate, which
we do not discuss here, however). Hence the question we have to discuss is how the
computation time t depends on the size L of the input data x (i.e. the length L of
the smallest register needed to represent x as a bit string).

However, a precise definition of "computation time" is still model dependent.
For a Turing machine we can take simply the number of head movements needed to
solve the problem, and in the network model we choose the number of steps needed
to execute the whole circuit, if gates which operate on different bits are allowed to
work simultaneously². Even with a fixed type of model the functional behavior of t
depends on the set of elementary operations we choose, e.g. the set of elementary
gates in the network model. It is therefore useful to divide computational problems
into complexity classes whose definitions do not suffer from model dependent aspects.
The most fundamental one is the class P which contains all problems which
can be computed in "polynomial time", i.e. t is, as a function of L, bounded from
above by a polynomial. The model independence of this class is basically the content
of the strong Church-Turing hypothesis which states, roughly speaking, that
each model of computation can be simulated in polynomial time on a probabilistic
Turing machine.

Problems of class P are considered "easy", everything else is "hard". However
even if a (decision) problem is hard the situation is not hopeless. E.g. consider
the factoring problem fac described above. It is generally believed (although not
proved) that this problem is not in class P. But if somebody gives us a divisor
p < l of m it is easy to check whether p is really a factor, and if the answer is
true we have computed fac(m, l). This example motivates the following definition:
A decision problem f is in class NP ("nondeterministic polynomial time") if there
is a boolean function f' in class P such that f(x) = 1 holds if and only if
f'(x, y) = 1 for some y (of size polynomial in the size of x). In
our example fac' is obviously defined by fac'(m, l, p) = 1 iff p < l and p is a divisor
of m. It is obvious that P is a subset of NP; the other inclusion however is rather
nontrivial. The conjecture is that P $\neq$ NP holds and great parts of complexity
theory are based on it. Its proof (or disproof) however represents one of the biggest
open questions of theoretical computer science.
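
The asymmetry between finding and checking a certificate can be sketched in code. The two functions below are illustrative only: `fac_check` plays the role of fac' and needs a single division, while `fac` simply tries every candidate p, so that the number of candidates grows exponentially with the number of bits of l. Restricting to divisors p > 1 is an assumption made here so that the problem is not trivial.

```python
def fac_check(m, l, p):
    """fac'(m, l, p): 1 if p is a divisor of m with 1 < p < l, else 0.

    Excluding p = 1 is an assumption; the cost is one modular reduction.
    """
    return 1 if 1 < p < l and m % p == 0 else 0


def fac(m, l):
    """fac(m, l) by exhaustive search over all candidate certificates p."""
    return 1 if any(fac_check(m, l, p) for p in range(2, l)) else 0
```

For instance `fac(15, 4)` is decided by the certificate p = 3, which `fac_check(15, 4, 3)` confirms immediately, whereas no certificate exists for `fac(15, 3)`.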

To introduce a third complexity class we have to generalize our point of view
slightly. Instead of a function $f : \mathbb{B}^N \to \mathbb{B}^M$ we can look at a noisy classical channel T
which sends the input value $x \in \mathbb{B}^N$ to a probability distribution $T_{xy}$, $y \in \mathbb{B}^M$ on
$\mathbb{B}^M$ (i.e. $T_{xy}$ is the transition matrix of the classical channel T; cf. Subsection 3.2.3).
Roughly speaking, we can interpret such a channel as a probabilistic computation
which can be realized as a circuit consisting of "probabilistic gates". This means
there are several different ways to proceed at each step and we use a classical random
number generator to decide which of them we have to choose. If we run our device
several times on the same input data x we get different results y with probability
$T_{xy}$. The crucial point is now that we can allow some of the outcomes to be wrong
as long as there is an easy way (i.e. a class P algorithm) to check the validity of
the results. Hence we define BPP ("bounded error probabilistic polynomial time")
as the class of all decision problems which admit a polynomial time probabilistic
algorithm with error probability less than $1/2 - \epsilon$ (for fixed $\epsilon$). It is obvious that
P $\subset$ BPP holds but the relation between BPP and NP is not known.

4.5.3. Reversible computing. — In the last subsection we have discussed the 
time needed to perform a certain computation. Other physical quantities which seem 
to be important are space and energy. Space can be treated in a similar way as time 
and there are in fact space-related complexity classes (e.g. PSPACE which stands
for "polynomial space"). Energy, however, is different, because it turns out, surprisingly,
that it is possible to do any calculation without expending any energy! One
source of energy consumption in a usual computer is the intrinsic irreversibility 
of the basic operations. E.g. a basic gate like AND maps two input bits to one 
output bit, which implies obviously that the input can not be reconstructed from 



^Note that we have glossed over a lot of technical problems at this point. The crucial difficulty
is that each circuit $C_N$ allows only the computation of a boolean function $f_N : \mathbb{B}^N \to \mathbb{B}$ which
acts on input data of length $N$. Since we are interested in answers for arbitrary finite length inputs
a sequence $C_N$, $N \in \mathbb{N}$, of circuits with appropriate uniformity properties is needed; cf. [124] for
details.
the output. In other words: one bit of information is erased during the operation
of the AND gate, hence a small amount of energy is dissipated to the environment.
A thermodynamic analysis, known as Landauer's principle, shows that this energy
loss is at least $k_B T \ln 2$, where $T$ is the temperature of the environment [106].

If we want to avoid this kind of energy dissipation we are restricted to reversible 
processes, i.e. it should be possible to reconstruct the input data from the output 
data. This is called reversible computation and it is performed in terms of reversible 
gates, which in turn can be described by invertible functions $f : \mathbb{B}^N \to \mathbb{B}^N$. This does
not restrict the class of problems which can be solved however: We can repackage
a non-invertible function $f : \mathbb{B}^N \to \mathbb{B}^M$ into an invertible one $f' : \mathbb{B}^{N+M} \to \mathbb{B}^{N+M}$
simply by $f'(x, 0) = (x, f(x))$ and an appropriate extension to the rest of $\mathbb{B}^{N+M}$. It
can even be shown that a reversible computer performs as well as a usual one, i.e.
an "irreversible" network can be simulated in polynomial time by a reversible one.
This will be of particular importance for quantum computing, because a reversible 
computer is, as we will see soon, a special case of a quantum computer. 
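
A minimal sketch of the repackaging $f'(x, 0) = (x, f(x))$ may be helpful here. It assumes the common extension $f'(x, y) = (x, y + f(x))$ with bitwise addition mod 2 on the rest of the domain; this particular extension and all names in the code are illustrative choices, since the text only requires some invertible extension.

```python
from itertools import product


def make_reversible(f, n):
    """Return f'(x, y) = (x, y XOR f(x)); x has n bits, y has as many bits as f(x)."""
    def f_rev(bits):
        x, y = bits[:n], bits[n:]
        fx = f(x)
        return x + tuple(yi ^ fi for yi, fi in zip(y, fx))
    return f_rev


def and_gate(x):
    """Irreversible AND: two input bits, one output bit."""
    return (x[0] & x[1],)


# The reversible version of AND acts on 2 + 1 = 3 bits.
rev_and = make_reversible(and_gate, 2)

inputs = list(product((0, 1), repeat=3))
outputs = [rev_and(b) for b in inputs]

# A map on a finite set is invertible iff its outputs are a permutation of its inputs.
is_bijection = sorted(outputs) == inputs
print(is_bijection, rev_and((1, 1, 0)))
```

With this extension the map is its own inverse, so applying it twice restores the input; for the AND gate it coincides with the three-bit gate usually called the Toffoli gate.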

4.5.4. The network model of a quantum computer. — Now we are ready 
to introduce a mathematical model for quantum computation. To this end we will
generalize the network model discussed in Subsection 4.5.1 to the network model of 
quantum computation. 
Figure 4.8: Universal sets of quantum gates (panels: one qubit gate, controlled U gate, CNOT gate).



A classical computer operates by a network of gates on a finite number of classical 
bits. A quantum computer operates on a finite number of qubits in terms of a 
network of quantum gates - this is the rough idea. To be more precise consider the 
Hilbert space $\mathcal{H}^{\otimes N}$ with $\mathcal{H} = \mathbb{C}^2$, which describes a quantum register consisting
of $N$ qubits. In $\mathcal{H}$ there is a preferred set $|0\rangle$, $|1\rangle$ of orthogonal states, describing
the two values a classical bit can have. Hence we can describe each possible value
$x$ of a classical register of length $N$ in terms of the computational basis
$|x\rangle = |x_1\rangle \otimes \cdots \otimes |x_N\rangle$, $x \in \mathbb{B}^N$. A quantum gate is now nothing else but a unitary operator
acting on a small number of qubits (preferably 1 or 2) and a quantum network is a
graph representing the composition of elementary gates taken from a small set $G$ of
unitaries. A quantum computation can now be defined as the application of such a
network to an input state $\psi$ of the quantum register (cf. Figure 4.9 for an example).
Similar to the classical case the set G should be universal; i.e. each unitary operator 
on a quantum register of arbitrary length can be represented as a composition of 
elements from G. Since the group of unitaries on a Hilbert space is continuous, it 
is not possible to do this with a finite set G. However we can find at least suitably 
small sets which have the chance to be realizable technically (e.g. in an ion-trap) 
somehow in the future. Particular examples are on the one hand the controlled U
operations and the set consisting of CNOT and all one-qubit gates on the other (cf.
Figure 4.8; for a proof of universality see Section 4.5 of [122]).

Figure 4.9: Quantum circuit for the discrete Fourier transform on a 4-qubit register.



Basically we could have considered arbitrary quantum operations instead of only 
unitaries as gates. We have seen however in Subsection 3.2.1 that we can implement 
each operation unitarily if we add an ancilla to the systems. Hence this kind of gen- 
eralization is already covered by the model. (As long as non-unitarily implemented 
operations are a desired feature. Decoherence effects due to unavoidable interaction
with the environment are a completely different story; we come back to this point 
at the end of the Subsection.) The same holds for measurements at intermediate 
steps and subsequent conditioned operations. In this case we get basically the same 
result with a different network where all measurements are postponed to the end. 
(Often it is however very useful to allow measurements at intermediate steps as we 
will see in the next Subsection.) 

Having a mathematical model of quantum computers in mind we are now ready 
to discuss how it would work in principle. 

1. The first step is in most cases preprocessing of the input data on a classical 
computer. E.g. the Shor algorithm for the factoring problem does not work if 
the input number m is a pure prime power. However in this case there is an 
efficient classical algorithm. Hence we have to check first whether m is of this 
particular form and use this classical algorithm where appropriate. 

2. Based on these preprocessed data we have to prepare the quantum register in
the next step. In the simplest case this means to write classical data, i.e.
to prepare the state $|x\rangle \in \mathcal{H}^{\otimes N}$ if the (classical) input is $x \in \mathbb{B}^N$. In many
cases however it might be more intelligent to use a superposition of several
$|x\rangle$, e.g. the state

$$\Psi = \frac{1}{\sqrt{2^N}} \sum_{x \in \mathbb{B}^N} |x\rangle, \tag{4.33}$$

which represents actually the superposition of all numbers the register can
represent - this is indeed the crucial point of quantum computing and we
come back to it below.

3. Now we can apply the quantum circuit $C$ to the input state $\psi$ and after the
calculation we get the output state $U\psi$, where $U$ is the unitary represented
by $C$.

4. To read out the data after the calculation we perform a von Neumann
measurement in the computational basis, i.e. we measure the observable given by
the one dimensional projectors $|x\rangle\langle x|$, $x \in \mathbb{B}^N$. Hence we get $x \in \mathbb{B}^N$ with
probability $P_x = |\langle U\psi | x\rangle|^2$.

5. Finally we have to postprocess the measured value $x$ on a classical computer
to end up with the final result $x'$. If, however, the output state $U\psi$ is a proper
superposition of basis vectors $|x\rangle$ (and not just one $|x\rangle$) the probability $P_x$
to get this particular $x'$ is less than 1. In other words we have performed a
probabilistic calculation as described in the last paragraph of Subsection 4.5.2.
Hence we have to check the validity of the results (with a class P algorithm
on a classical computer) and if they are wrong we have to go back to step 2.

So, why is quantum computing potentially useful? First of all, a quantum
computer can perform at least as well as a classical computer. This follows immediately
from our discussion of reversible computing in Subsection 4.5.3 and the fact that
any invertible function $f : \mathbb{B}^N \to \mathbb{B}^N$ defines a unitary by $U_f : |x\rangle \mapsto |f(x)\rangle$ (the
quantum CNOT gate in Figure 4.8 arises exactly in this way from the classical
CNOT). On the other hand, there is strong evidence that a
quantum computer can solve problems in polynomial time which a classical
computer can not. The most striking example is the Shor algorithm, which
provides a way to solve the factoring problem (which is most probably not in class
P) in polynomial time. If we introduce the new complexity class BQP of decision
problems which can be solved with high probability and in polynomial time with a
quantum computer, we can express this conjecture as BPP $\neq$ BQP.

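The mechanism which gives a quantum computer its potential power is the
ability to operate not just on one value $x \in \mathbb{B}^N$, but on whole superpositions
of values, as already mentioned in step 2 above. E.g. consider a, not necessarily
invertible, map $f : \mathbb{B}^N \to \mathbb{B}^M$ and the unitary operator $U_f$ on
$\mathcal{H}^{\otimes N} \otimes \mathcal{H}^{\otimes M}$ with

$$U_f : |x\rangle \otimes |0\rangle \mapsto |x\rangle \otimes |f(x)\rangle. \tag{4.34}$$

If we let $U_f$ act on a register in the state $\Psi \otimes |0\rangle$ with $\Psi$ from Equation (4.33) we get the
result

$$U_f(\Psi \otimes |0\rangle) = \frac{1}{\sqrt{2^N}} \sum_{x \in \mathbb{B}^N} |x\rangle \otimes |f(x)\rangle. \tag{4.35}$$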

Hence a quantum computer can evaluate the function $f$ on all possible arguments
$x \in \mathbb{B}^N$ at the same time! To benefit from this feature - usually called quantum
parallelism - is, however, not as easy as it looks. If we perform a measurement
on $U_f(\Psi \otimes |0\rangle)$ in the computational basis we get the value of $f$ for exactly one
argument and the rest of the information originally contained in $U_f(\Psi \otimes |0\rangle)$ is
destroyed. In other words it is not possible to read out all pairs $(x, f(x))$ from
$U_f(\Psi \otimes |0\rangle)$ and to fill a (classical) lookup table with them. To take advantage
of quantum parallelism we have to use a clever algorithm within the quantum
computation step (step 3 above). In the next section we will consider a particular
example for this.
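
The limitation of quantum parallelism can be made concrete with a small state-vector simulation. The sketch below stores amplitudes in a dictionary indexed by pairs $(x, y)$ and assumes the extension $U_f |x\rangle \otimes |y\rangle = |x\rangle \otimes |y + f(x)\rangle$ (bitwise addition mod 2) of Equation (4.34); this extension, the toy map `f` and all function names are our own illustrative assumptions.

```python
import math
import random


def uniform_superposition(n):
    """Psi (x) |0>: equal amplitudes for all x, second register in |0>."""
    amp = 1.0 / math.sqrt(2 ** n)
    return {(x, 0): amp for x in range(2 ** n)}


def apply_oracle(state, f):
    """U_f |x>|y> = |x>|y XOR f(x)> (assumed extension of Eq. (4.34))."""
    out = {}
    for (x, y), amp in state.items():
        key = (x, y ^ f(x))
        out[key] = out.get(key, 0.0) + amp
    return out


def measure(state, rng):
    """Sample one basis state (x, y) with probability |amplitude|^2."""
    outcomes = list(state)
    weights = [abs(state[k]) ** 2 for k in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def f(x):
    """Toy non-invertible map on 3-bit numbers (hypothetical example)."""
    return (x * x) % 8


state = apply_oracle(uniform_superposition(3), f)
x, y = measure(state, random.Random(0))
# The state carries all 8 pairs (x, f(x)), but a measurement reveals only one.
print(len(state), (x, y))
```

A single application of the oracle produces all pairs at once, yet each run of `measure` yields exactly one of them, and in a real device the remaining amplitudes are lost.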

Before we come to this point, let us give some additional comments which link
this section to other parts of quantum information. The first point concerns
entanglement. The state $U_f(\Psi \otimes |0\rangle)$ is highly entangled (although $\Psi$ is separable since
$\Psi = [2^{-1/2}(|0\rangle + |1\rangle)]^{\otimes N}$), and this fact is essential for the "exponential speedup"
of computations we could gain in a quantum computer. In other words, to
outperform a classical computer, entanglement is the most crucial resource - this will
become more transparent in the next section. The second remark concerns error
correction. Up to now we have assumed implicitly that all components of a quantum
computer work perfectly without any error. In reality however decoherence effects
make it impossible to realize unitarily implemented operations, and we have to deal
with noisy channels. Fortunately it is possible within quantum information to
correct at least a certain amount of errors, as we have seen in Section 4.4. Hence unlike
an analog computer^ a quantum computer can be designed fault tolerant, i.e. it can
work with imperfectly manufactured components.

4.5.5. Simon's problem. — We will consider now a particular problem (known
as Simon's problem; cf. [143]) which shows explicitly how a quantum computer
can speed up a problem which is hard to solve with a classical computer. It does
not fit however exactly into the general scheme sketched in the last subsection,
because a quantum "oracle" is involved, i.e. a black box which performs an (a priori
unknown) unitary transformation on an input state given to it. The term "oracle"
indicates here that we are not interested in the time the black box needs to perform
the calculation but only in the number of times we have to access it. Hence this
example does not prove the conjecture BPP $\neq$ BQP stated above. Other quantum
algorithms which we do not have room to discuss here include: the Deutsch
and Deutsch-Jozsa problems, the Grover search algorithm and of course
Shor's factoring algorithm [139, 140].

Hence let us assume that our black box calculates the unitary $U_f$ from Equation
(4.34) with a map $f : \mathbb{B}^N \to \mathbb{B}^N$ which is two to one and has period $a$, i.e. for $x \neq y$ we have
$f(x) = f(y) \Leftrightarrow y = x + a \bmod 2$. The task is to find $a$. Classically, this problem is hard, i.e.
we have to query the oracle exponentially often. To see this note first that we have
to find a pair $(x, y)$ with $f(x) = f(y)$ and the probability to get it with two random
queries is $2^{-N}$ (since there is for each $x$ exactly one $y \neq x$ with $f(x) = f(y)$). If we
use the box $2^{N/4}$ times, we get less than $2^{N/2}$ different pairs. Hence the probability
to get the correct solution is at most $2^{-N/2}$, i.e. arbitrarily small even with exponentially
many queries.

Assume now that we let our box act on a quantum register $\mathcal{H}^{\otimes N} \otimes \mathcal{H}^{\otimes N}$ in the
state $\Psi \otimes |0\rangle$ with $\Psi$ from Equation (4.33) to get $U_f(\Psi \otimes |0\rangle)$ from (4.35). Now
we measure the second register. The outcome is one of $2^{N-1}$ possible values (say
$f(x_0)$), each of which occurs with equal probability. Hence, after the measurement the first
register is in the state $2^{-1/2}(|x_0\rangle + |x_0 + a\rangle)$. Now we let a Hadamard gate $H$
act on each qubit of the first register and the result is (this follows with a short
calculation)

$$\frac{1}{\sqrt{2}} H^{\otimes N} \bigl(|x_0\rangle + |x_0 + a\rangle\bigr) = \frac{1}{\sqrt{2^{N-1}}} \sum_{y \cdot a = 0} (-1)^{x_0 \cdot y} |y\rangle \tag{4.36}$$

where the dot denotes the ($\mathbb{B}$-valued) scalar product in the vector space $\mathbb{B}^N$. Now
we perform a measurement on the first register (in the computational basis) and we get
a $y \in \mathbb{B}^N$ with the property $y \cdot a = 0$. If we repeat this procedure until we have
$N - 1$ linearly independent values $y_j$ (the $y$ with $y \cdot a = 0$ form an $(N-1)$-dimensional
subspace) we can determine $a$ as the nonzero solution of the system
of equations $y_1 \cdot a = 0, \ldots, y_{N-1} \cdot a = 0$. The probability to appear as an outcome of
the second measurement is for each $y$ with $y \cdot a = 0$ given by $2^{1-N}$. Therefore the
success probability can be made arbitrarily big while the number of times we have
to access the box is linear in $N$.
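
The classical post-processing of this procedure can be sketched as follows. The quantum part is replaced by a classical stand-in, `sample_y`, which draws $y$ uniformly from the set $\{y : y \cdot a = 0\}$, i.e. from the outcome distribution stated above; bit strings are stored as Python integers. All names are hypothetical, and the final brute-force solve is only meant for small $N$.

```python
import random


def dot(u, v):
    """B-valued scalar product of two bit strings stored as integers."""
    return bin(u & v).count("1") % 2


def sample_y(a, n, rng):
    """Classical stand-in for one quantum run: uniform y in B^n with y . a = 0."""
    while True:
        y = rng.randrange(2 ** n)
        if dot(y, a) == 0:
            return y


def add_to_basis(basis, y):
    """Gaussian elimination over GF(2): keep y only if it is independent."""
    for b in basis:
        if y & (1 << (b.bit_length() - 1)):  # y contains the leading bit of b
            y ^= b
    if y:
        basis.append(y)


def find_period(a, n, rng):
    """Collect n - 1 independent equations y . a = 0 and solve them for a."""
    basis, runs = [], 0
    while len(basis) < n - 1:
        add_to_basis(basis, sample_y(a, n, rng))
        runs += 1
    # Brute-force solve, acceptable for small n only.
    solutions = [c for c in range(1, 2 ** n)
                 if all(dot(b, c) == 0 for b in basis)]
    return solutions, runs


solutions, runs = find_period(0b10110, 5, random.Random(0))
print(solutions, runs)
```

Once $N - 1$ independent equations are known, the solution space is one-dimensional, so the only nonzero solution is the period $a$. For realistic $N$ the brute-force step would be replaced by solving the linear system directly.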

4.6 Quantum cryptography 

Finally we want to have a short look at quantum cryptography - another more
practical application of quantum information, which has the potential to emerge into 



^If an analog computer works reliably only with a certain accuracy, we can rewrite the algorithm 
into a digital one. 
technology in the not so distant future (see e.g. |95[ Q for some experimental
realizations and p9| for a more detailed overview). Hence let us assume that Alice 
has a message $x \in \mathbb{B}^N$ which she wants to send secretly to Bob over a public
communication channel. One way to do this is the so called "one-time pad": Alice
generates randomly a second bit-string $y \in \mathbb{B}^N$ of the same length as $x$ and sends $x + y$
instead of $x$. Without knowledge of the key $y$ it is completely impossible to recover
the message $x$ from $x + y$. Hence this is a perfectly secure method to transmit
secret data. Unfortunately it is completely useless without a secure way to transmit
the key $y$ to Bob, because Bob needs $y$ to decrypt the message $x + y$ (simply by
adding $y$ again). What makes the situation even worse is the fact that the key $y$ can
be used only once (therefore the name one-time pad). If two messages $x_1$, $x_2$ are
encrypted with the same key we can use $x_1$ as a key to decrypt $x_2$ and vice versa:
$(x_1 + y) + (x_2 + y) = x_1 + x_2$, hence both messages are partly compromised.
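
The one-time pad and the danger of key reuse can be checked directly, since addition in $\mathbb{B}^N$ is the bitwise XOR. The following sketch works on bytes instead of single bits; the two messages and the helper name `xor_bytes` are arbitrary illustrative choices.

```python
import secrets


def xor_bytes(a, b):
    """Bitwise addition mod 2 of two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("message and key must have the same length")
    return bytes(p ^ q for p, q in zip(a, b))


x1 = b"ATTACK AT DAWN"
x2 = b"RETREAT AT SIX"
y = secrets.token_bytes(len(x1))   # random key of the same length

c1 = xor_bytes(x1, y)              # x1 + y
c2 = xor_bytes(x2, y)              # x2 + y, same key used twice

# Bob decrypts by adding y again.
print(xor_bytes(c1, y))
# An eavesdropper who sees both ciphertexts obtains x1 + x2 without knowing y.
print(xor_bytes(c1, c2) == xor_bytes(x1, x2))
```

If the eavesdropper additionally knows or guesses $x_1$, adding it to $x_1 + x_2$ reveals $x_2$ completely, which is why each key may be used only once.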

Due to these problems completely different approaches, namely "public key
systems" like DSA and RSA are used today for cryptography. The idea is to use two
keys instead of one: a private key which is used for decryption and only known to its
owner and a public key used for encryption, which is publicly available (we do not
discuss the algorithms needed for key generation, encryption and decryption here,
see [145] and the references therein instead). To use this method, Bob generates
a key pair $(z, y)$, keeps his private key ($y$) at a secure place and sends the public
one ($z$) to Alice over a public channel. Alice encrypts her message with $z$, sends
the result to Bob and he can decrypt it with $y$. The security of this scheme relies
(in the case of RSA) on the assumption that the factoring problem is computationally hard, i.e. not in
class P, because to calculate $y$ from $z$ requires the factorization of large integers.
Since the latter is tractable on quantum computers via Shor's algorithm, the
security of public key systems breaks down if quantum computers become available in
the future. Another problem of more fundamental nature is the unproven status of
the conjecture that factorization is not solvable in polynomial time. Consequently,
security of public key systems is not proven either.

The crucial point is now that quantum information provides a way to distribute 
a cryptographic key y in a secure way, such that y can be used as a one-time 
pad afterwards. The basic idea is to use the no cloning theorem to detect possible 
eavesdropping attempts. To make this more transparent, let us consider a particular 
example here, namely the probably most prominent protocol, proposed by Bennett
and Brassard in 1984.

1. Assume that Alice wants to transmit bits from the (randomly generated) key
$y \in \mathbb{B}^N$ through an ideal quantum channel to Bob. Before they start they
settle upon two orthonormal bases $e_0, e_1 \in \mathcal{H}$, respectively $f_0, f_1 \in \mathcal{H}$, which
are mutually nonorthogonal, i.e. $|\langle e_j, f_k\rangle| \geq \epsilon > 0$ with $\epsilon$ big enough for each
$j, k = 0, 1$. If photons are used as information carrier a typical choice are
linearly polarized photons with polarization directions rotated by 45° against
each other.

2. To send one bit $j \in \mathbb{B}$ Alice selects now at random one of the two bases, say
$e_0, e_1$, and then she sends a qubit in the state $|e_j\rangle\langle e_j|$ through the channel.
Note that neither Bob nor a potential eavesdropper knows which basis she
has chosen.

3. When Bob receives the qubit he selects, as Alice before, at random a basis and
performs the corresponding von Neumann measurement to get one classical
bit $k \in \mathbb{B}$, which he records together with the measurement method.

4. Both repeat this procedure until the whole string $y \in \mathbb{B}^N$ is transmitted
and then Bob tells Alice (through a classical, public communication channel)
bit for bit which basis he has used for the measurement (but not the result
of the measurement). If he has used the same basis as Alice both keep the
corresponding bit, otherwise they discard it. They end up with a bit-string
$y' \in \mathbb{B}^M$ of reduced length $M$. If this is not sufficient they have to continue
sending random bits until the key is long enough. For large $N$ the rate of
successfully transmitted bits per bits sent is obviously 1/2. Hence Alice has
to send approximately twice as many bits as they need.
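
The four steps above can be sketched as a small classical simulation. It assumes an ideal channel, no eavesdropper, and the optimal choice of bases in which a measurement in the wrong basis gives a uniformly random bit; bases are labelled 0 and 1, and all function names are hypothetical.

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Ideal measurement: the same basis returns the bit, the other basis a fair coin."""
    return bit if prep_basis == meas_basis else rng.randrange(2)


def bb84_sift(n, rng):
    """Run steps 1-4 for n qubits and return the sifted keys of Alice and Bob."""
    alice_bits = [rng.randrange(2) for _ in range(n)]
    alice_bases = [rng.randrange(2) for _ in range(n)]
    bob_bases = [rng.randrange(2) for _ in range(n)]
    bob_bits = [measure(b, pa, pb, rng)
                for b, pa, pb in zip(alice_bits, alice_bases, bob_bases)]
    # Bob announces his bases; only positions with matching bases are kept.
    keep = [i for i in range(n) if alice_bases[i] == bob_bases[i]]
    return [alice_bits[i] for i in keep], [bob_bits[i] for i in keep]


key_alice, key_bob = bb84_sift(4000, random.Random(1))
print(len(key_alice) / 4000, key_alice == key_bob)
```

In this idealized setting the two sifted keys always agree, and roughly half of the transmitted bits survive the sifting, in line with the rate 1/2 stated in step 4.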

To see why this procedure is secure, assume now that the eavesdropper Eve can
listen to and modify the information sent through the quantum channel and that she
can listen on the classical channel but can not modify it (we come back to this
restriction in a minute). Hence Eve can intercept the qubits sent by Alice and make
two copies of each. One she forwards to Bob and the other she keeps for later analysis.
Due to the no cloning theorem however she has produced errors in both copies and
the quality of her own decreases if she tries to make the error in Bob's as small
as possible. Even if Eve knows about the two bases $e_0, e_1$ and $f_0, f_1$ she does not
know which one Alice uses to send a particular qubit^. Hence Eve has to decide
randomly which basis to choose (as Bob). If $e_0, e_1$ and $f_0, f_1$ are chosen optimally,
i.e. $|\langle e_j, f_k\rangle|^2 = 0.5$, it is easy to see that the error rate Eve necessarily produces if
she randomly measures in one of the bases is 1/4 for large $N$. To detect this error
Alice and Bob simply have to sacrifice portions of the generated key and to compare
randomly selected bits using their classical channel. If the error rate they detect is
too big they can decide to drop the whole key and restart from the beginning.
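
The error rate of 1/4 can be checked with a short simulation of the simplest attack, in which Eve measures each qubit in a randomly chosen basis and resends the result in that basis. This is a sketch under the assumption of optimally chosen bases ($|\langle e_j, f_k\rangle|^2 = 0.5$); the function names are hypothetical and more general attacks are not covered.

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Ideal measurement: the same basis returns the bit, the other basis a fair coin."""
    return bit if prep_basis == meas_basis else rng.randrange(2)


def error_rate_with_eve(n, rng):
    """Fraction of sifted bits on which Alice and Bob disagree."""
    errors = kept = 0
    for _ in range(n):
        bit, alice_basis = rng.randrange(2), rng.randrange(2)
        eve_basis = rng.randrange(2)
        eve_bit = measure(bit, alice_basis, eve_basis, rng)    # Eve measures ...
        bob_basis = rng.randrange(2)
        bob_bit = measure(eve_bit, eve_basis, bob_basis, rng)  # ... and resends in her basis
        if alice_basis == bob_basis:                           # bit survives the sifting
            kept += 1
            errors += bob_bit != bit
    return errors / kept


rate = error_rate_with_eve(20000, random.Random(2))
print(rate)
```

Eve picks the wrong basis in half of the cases, and each of those gives Bob a wrong bit with probability 1/2, so the simulated rate fluctuates around 1/4.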

So let us discuss finally a situation where Eve is able to intercept the quantum 
and the classical channel. This would imply that she can play Bob's part for Alice 
and Alice's for Bob. As a result she shares a key with Alice and one with Bob. 
Hence she can decode all secret data Alice sends to Bob, read it, and encode it 
finally again to forward it to Bob. To secure against such a "woman in the middle 
attack" , Alice and Bob can use classical authentication protocols which ensure that 
the correct person is at the other end of the line. This implies that they need a 
small amount of initial secret material which can be renewed however from the new 
key they have generated through quantum communication. 



^If Alice and Bob use only one basis to send the data and Eve knows about it she can produce
of course ideal copies of the qubits. This is actually the reason why two nonorthogonal bases are 
necessary. 



Chapter 5 
Entanglement measures 



We have seen in the last chapter that entanglement is an essential resource for 
many tasks of quantum information theory, like teleportation or quantum compu- 
tation. This means that entangled states are needed for the functioning of many 
processes and that they are consumed during operation. It is therefore necessary 
to have measures which tell us whether the entanglement contained in a number 
of quantum systems is sufficient to perform a certain task. What makes this sub- 
ject difficult, is the fact that we can not restrict the discussion to systems in a 
maximally or at least highly entangled pure state. Due to unavoidable decoherence 
effects realistic applications have to deal with imperfect systems in mixed states, 
and exactly in this situation the question for the amount of available entanglement 
is interesting. 

5.1 General properties and definitions 

The difficulties arising if we try to quantify entanglement can be divided, roughly 
speaking, into two parts: First we have to find a reasonable quantity which describes 
exactly those properties which we are interested in and second we have to calculate 
it for a given state. In this section we will discuss the first problem and consider 
several different possibilities to define entanglement measures. 

5.1.1. Axiomatics. — First of all, we will collect some general properties which
a reasonable entanglement measure should have (cf. also ||16, 154, 153, 155, 8^). To 
quantify entanglement, means nothing else but to associate a positive real number 
to each state of (finite dimensional) two-partite systems. 

Axiom E0 An entanglement measure is a function $E$ which assigns to each state
$\rho$ of a finite dimensional bipartite system a positive real number $E(\rho) \in \mathbb{R}^+$.

Note that we have glossed over some mathematical subtleties here, because $E$
is not just defined on the state space of $\mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ systems for particularly chosen
Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$ - $E$ is defined on any state space for arbitrary finite
dimensional $\mathcal{H}$ and $\mathcal{K}$. This is expressed mathematically most conveniently by a
family of functions which behaves naturally under restrictions (i.e. the restriction
to a subspace $\mathcal{H}' \otimes \mathcal{K}'$ coincides with the function belonging to $\mathcal{H}' \otimes \mathcal{K}'$). However
we will see soon that we can safely ignore this problem.

The next point concerns the range of $E$. If $\rho$ is unentangled $E(\rho)$ should be
zero of course and it should be maximal on maximally entangled states. But what
happens if we allow the dimensions of $\mathcal{H}$ and $\mathcal{K}$ to grow? To get an answer consider
first a pair of qubits in a maximally entangled state $\rho$. It should contain exactly
one bit of entanglement, i.e. $E(\rho) = 1$, and $N$ pairs in the state $\rho^{\otimes N}$ should contain
$N$ bits. If we interpret $\rho^{\otimes N}$ as a maximally entangled state of an $\mathcal{H} \otimes \mathcal{H}$ system
with $\mathcal{H} = (\mathbb{C}^2)^{\otimes N}$ we get $E(\rho^{\otimes N}) = \log_2(\dim(\mathcal{H})) = N$, where we have to reshuffle in
$\rho^{\otimes N}$ the tensor factors such that $(\mathbb{C}^2 \otimes \mathbb{C}^2)^{\otimes N}$ becomes $(\mathbb{C}^2)^{\otimes N} \otimes (\mathbb{C}^2)^{\otimes N}$ (i.e. "all
Alice particles to the left and all Bob particles to the right"; cf. Section 4.3). This
observation motivates the following.

Axiom E1 (Normalization) $E$ vanishes on separable and takes its maximum on
maximally entangled states. This means more precisely that $E(\sigma) \leq E(\rho) = \log_2(d)$
for $\rho, \sigma \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$, $d = \dim(\mathcal{H})$, and $\rho$ maximally entangled.

One thing an entanglement measure should tell us, is how much quantum
information can be maximally teleported with a certain amount of entanglement, where
this maximum is taken over all possible teleportation schemes and distillation
protocols, hence it can not be increased further by additional LOCC operations on the
entangled systems in question. This consideration motivates the following Axiom.

Axiom E2 (LOCC monotonicity) $E$ can not increase under LOCC operations,
i.e. $E[T(\rho)] \leq E(\rho)$ for all states $\rho$ and all LOCC channels $T$.

A special case of LOCC operations are of course local unitary operations $U \otimes V$.
Axiom E2 implies now that $E(U \otimes V \rho\, U^* \otimes V^*) \leq E(\rho)$ and on the other hand
$E(U^* \otimes V^* \tilde\rho\, U \otimes V) \leq E(\tilde\rho)$, hence with $\tilde\rho = U \otimes V \rho\, U^* \otimes V^*$ we get
$E(\rho) \leq E(U \otimes V \rho\, U^* \otimes V^*)$, therefore $E(\rho) = E(U \otimes V \rho\, U^* \otimes V^*)$. We fix this property as a weakened
version of Axiom E2.

Axiom E2a (Local unitary invariance) $E$ is invariant under local unitaries,
i.e. $E(U \otimes V \rho\, U^* \otimes V^*) = E(\rho)$ for all states $\rho$ and all unitaries $U$, $V$.

This axiom shows why we do not have to bother about families of functions 
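as mentioned above. If $E$ is defined on $\mathcal{S}(\mathcal{H} \otimes \mathcal{H})$ it is automatically defined on
$\mathcal{S}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ for all Hilbert spaces $\mathcal{H}_k$ with $\dim(\mathcal{H}_k) \leq \dim(\mathcal{H})$, because we can
embed $\mathcal{H}_1 \otimes \mathcal{H}_2$ under this condition unitarily into $\mathcal{H} \otimes \mathcal{H}$.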

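Consider now a convex linear combination $\lambda\rho + (1 - \lambda)\sigma$ with $0 \leq \lambda \leq 1$.
Entanglement can not be "generated" by mixing two states, i.e. $E(\lambda\rho + (1 - \lambda)\sigma) \leq
\lambda E(\rho) + (1 - \lambda) E(\sigma)$.

Axiom E3 (Convexity) $E$ is a convex function, i.e. $E(\lambda\rho + (1 - \lambda)\sigma) \leq \lambda E(\rho) +
(1 - \lambda) E(\sigma)$ for two states $\rho, \sigma$ and $0 \leq \lambda \leq 1$.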

The next property concerns the continuity of $E$, i.e. if we perturb $\rho$ slightly
the change of E{p) should be small. This can be expressed most conveniently as 
continuity of E in the trace norm. At this point however it is not quite clear, how we 
have to handle the fact that E is defined for arbitrary Hilbert spaces. The following 
version is motivated basically by the fact that it is a crucial assumption in Theorem 
5T| and |T|. 



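Axiom E4 (Continuity) Consider a sequence of Hilbert spaces $\mathcal{H}_N$, $N \in \mathbb{N}$, and
two sequences of states $\rho_N, \sigma_N \in \mathcal{S}(\mathcal{H}_N \otimes \mathcal{H}_N)$ with $\lim_{N \to \infty} \|\rho_N - \sigma_N\|_1 = 0$. Then we
have

$$\lim_{N \to \infty} \frac{E(\rho_N) - E(\sigma_N)}{1 + \log_2(\dim \mathcal{H}_N)} = 0. \tag{5.1}$$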

The last point we have to consider here are additivity properties: Since we are
looking at entanglement as a resource, it is natural to assume that we can do with
two pairs in the state $\rho$ twice as much as with one $\rho$, or more precisely $E(\rho \otimes \rho) =
2E(\rho)$ (in $\rho \otimes \rho$ we have to reshuffle tensor factors again; see above).

Axiom E5 (Additivity) For any pair of two-partite states $\rho, \sigma \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$ we
have $E(\sigma \otimes \rho) = E(\sigma) + E(\rho)$.

Unfortunately this rather natural looking axiom seems to be too strong (it ex- 
cludes reasonable candidates). It should be however always true that entanglement 
can not increase if we put two pairs together. 

Axiom E5a (Subadditivity) For any pair of states $\rho, \sigma$ we have $E(\rho \otimes \sigma) \leq
E(\rho) + E(\sigma)$.

There are further modifications of additivity available in the literature. Most
frequently used is the following, which restricts Axiom E5 to the case $\rho = \sigma$.

Axiom E5b (Weak additivity) For any state $\rho$ of a bipartite system we have
$N^{-1} E(\rho^{\otimes N}) = E(\rho)$.

Finally, the weakest version of additivity only deals with the behavior of $E$ for
large tensor products, i.e. $\rho^{\otimes N}$ for $N \to \infty$.

Axiom E5c (Existence of a regularization) For each state $\rho$ the limit

$$E^\infty(\rho) = \lim_{N \to \infty} \frac{E(\rho^{\otimes N})}{N} \tag{5.2}$$

exists.

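5.1.2. Pure states. — Let us consider now a pure state $\rho = |\psi\rangle\langle\psi| \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$.
If it is entangled its partial trace $\sigma = \operatorname{tr}_{\mathcal{H}} \rho = \operatorname{tr}_{\mathcal{H}} |\psi\rangle\langle\psi|$ is mixed and
for a maximally entangled state it is maximally mixed. This suggests to use the
von Neumann entropy^ of $\sigma$, which measures how much a state is mixed, as an
entanglement measure for pure states, i.e. we define

$$E_{vN}(\rho) = -\operatorname{tr}\bigl[\operatorname{tr}_{\mathcal{H}} \rho \, \ln(\operatorname{tr}_{\mathcal{H}} \rho)\bigr]. \tag{5.3}$$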

It is easy to deduce from the properties of the von Neumann entropy that $E_{vN}$
satisfies Axioms E0, E1, E3 and E5b. Somewhat more difficult is only Axiom E2
which follows however from a nice theorem of Nielsen [119] which relates LOCC
operations (on pure states) to the theory of majorization. To state it here we need
first some terminology. Consider two probability distributions $\lambda = (\lambda_1, \ldots, \lambda_M)$
and $\mu = (\mu_1, \ldots, \mu_N)$ both given in decreasing order (i.e. $\lambda_1 \geq \ldots \geq \lambda_M$ and
$\mu_1 \geq \ldots \geq \mu_N$). We say that $\lambda$ is majorized by $\mu$, in symbols $\lambda \prec \mu$, if

$$\sum_{j=1}^{k} \lambda_j \leq \sum_{j=1}^{k} \mu_j \quad \forall k = 1, \ldots, \min(M, N) \tag{5.4}$$

holds. Now we have the following result (see [119] for a proof).

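Theorem 5.1.1 A pure state $\psi = \sum_j \lambda_j^{1/2} e_j \otimes e_j' \in \mathcal{H} \otimes \mathcal{K}$ can be transformed
into another pure state $\phi = \sum_j \mu_j^{1/2} f_j \otimes f_j' \in \mathcal{H} \otimes \mathcal{K}$ via a LOCC operation, iff the
Schmidt coefficients of $\psi$ are majorized by those of $\phi$, i.e. $\lambda \prec \mu$.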

The von Neumann entropy of the restriction $\operatorname{tr}_{\mathcal{H}} |\psi\rangle\langle\psi|$ can be immediately
calculated from the Schmidt coefficients $\lambda$ of $\psi$ by $E_{vN}(|\psi\rangle\langle\psi|) = -\sum_j \lambda_j \ln(\lambda_j)$.
Axiom E2 follows therefore from the fact that the entropy $S(\lambda) = -\sum_j \lambda_j \ln(\lambda_j)$
of a probability distribution $\lambda$ is a Schur concave function, i.e. $\lambda \prec \mu$ implies $S(\lambda) \geq
S(\mu)$; see [121].
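
The majorization condition (5.4) and the Schur concavity of the entropy are easy to check numerically for given Schmidt coefficients. The following sketch uses hypothetical function names and two arbitrarily chosen distributions; the tolerance `tol` is only there to absorb floating point rounding.

```python
import math


def majorized_by(lam, mu, tol=1e-12):
    """True if lam is majorized by mu in the sense of (5.4)."""
    lam = sorted(lam, reverse=True)
    mu = sorted(mu, reverse=True)
    sum_lam = sum_mu = 0.0
    for k in range(min(len(lam), len(mu))):
        sum_lam += lam[k]
        sum_mu += mu[k]
        if sum_lam > sum_mu + tol:
            return False
    return True


def entropy(p):
    """S(p) = -sum_j p_j ln(p_j), with the convention 0 ln 0 = 0."""
    return -sum(q * math.log(q) for q in p if q > 0)


lam = [0.5, 0.5]   # Schmidt coefficients of a maximally entangled two-qubit state
mu = [0.9, 0.1]    # a less entangled state

print(majorized_by(lam, mu), majorized_by(mu, lam))
print(entropy(lam), entropy(mu))
```

Here `lam` is majorized by `mu` but not vice versa, and the entropy of `lam` is the larger one, so by Theorem 5.1.1 the maximally entangled state can be converted into the less entangled one by LOCC, but not the other way around.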



Hence we have seen so far that $E_{vN}$ is one possible candidate for an entanglement
measure on pure states. In the following we will see that it is in fact the only
candidate which is physically reasonable. There are basically two reasons for this.
The first one deals with distillation of entanglement. It was shown by Bennett et
al. that each state $\psi \in \mathcal{H} \otimes \mathcal{K}$ of a bipartite system can be prepared out of
(a possibly large number of) systems in an arbitrary entangled state $\phi$ by LOCC
operations. To be more precise, we can find a sequence of LOCC operations

$$T_N : \mathcal{B}\bigl[(\mathcal{H} \otimes \mathcal{K})^{\otimes M(N)}\bigr] \to \mathcal{B}\bigl[(\mathcal{H} \otimes \mathcal{K})^{\otimes N}\bigr] \tag{5.5}$$

such that

$$\lim_{N \to \infty} \bigl\| T_N^*\bigl(|\phi\rangle\langle\phi|^{\otimes N}\bigr) - |\psi\rangle\langle\psi|^{\otimes M(N)} \bigr\|_1 = 0 \tag{5.6}$$



^We assume here and in the following that the reader is sufficiently familiar with entropies. If
this is not the case we refer to [123].
holds with a nonvanishing rate $r = \lim_{N \to \infty} M(N)/N$. This is done either by
distillation ($r < 1$ if $\psi$ is higher entangled than $\phi$) or by "diluting" entanglement,
i.e. creating many less entangled states from few highly entangled ones ($r > 1$).
All this can be performed in a reversible way: We can start with some maximally
entangled qubits, dilute them to get many less entangled states which can be
distilled afterwards to get the original states back (again only in an asymptotic sense).
The crucial point is that the asymptotic rate $r$ of these processes is given in terms
of $E_{vN}$ by $r = E_{vN}(|\phi\rangle\langle\phi|)/E_{vN}(|\psi\rangle\langle\psi|)$. Hence we can say, roughly speaking, that
$E_{vN}(|\psi\rangle\langle\psi|)$ describes exactly the amount of maximally entangled qubits which is
contained in $|\psi\rangle\langle\psi|$.

A second somewhat more formal reason is that $E_{vN}$ is the only entanglement
measure on the set of pure states which satisfies the axioms formulated above. In
other words the following "uniqueness theorem for entanglement measures" holds
[ 12S| , |l5^,[57| 

Theorem 5.1.2 The reduced von Neumann entropy $E_{vN}$ is the only entanglement
measure on pure states which satisfies Axioms E0 - E5.



5.1.3. Entanglement measures for mixed states. — To find reasonable en- 
tanglement measures for mixed states is much more difficult. There are in fact many 
possibilities (e.g. the maximally entangled fraction introduced in Subsection 3.1.1
can be regarded as a simple measure) and we want to present therefore only four
of the most reasonable candidates. Among those measures which we do not discuss
here are negativity quantities ([158] and the references therein), the "best separable
approximation" [108], the base norm associated with the set of separable states
[157, 136] and ppt-distillation rates [133].



The first measure we want to present is oriented along the discussion of pure 
states: We define, roughly speaking, the asymptotic rate with which maximally 
entangled qubits can be distilled at most out of a state $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$ as the
Entanglement of Distillation $E_D(\rho)$ of $\rho$; cf. ]l^ . To be more precise consider all
possible distillation protocols for $\rho$ (cf. Section 4.3), i.e. all sequences of LOCC
channels

$$T_N : \mathcal{B}(\mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}) \to \mathcal{B}(\mathcal{H}^{\otimes N} \otimes \mathcal{K}^{\otimes N}) \qquad (5.7)$$

such that

$$\lim_{N\to\infty} \bigl\| T_N^*(\rho^{\otimes N}) - |\Omega_N\rangle\langle\Omega_N| \bigr\|_1 = 0 \qquad (5.8)$$

holds with a sequence of maximally entangled states $\Omega_N \in \mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}$. Now we can define

$$E_D(\rho) = \sup_{(T_N)_{N\in\mathbb{N}}} \limsup_{N\to\infty} \frac{\log_2(d_N)}{N}, \qquad (5.9)$$

where the supremum is taken over all possible distillation protocols $(T_N)_{N\in\mathbb{N}}$. It
is not very difficult to see that $E_D$ satisfies E0, E1, E2 and E5b. It is not known
whether continuity (E4) and convexity (Axiom E3) hold. It can be shown however
that $E_D$ is not convex (and not additive; Axiom E5) if npt bound entangled states
exist (see [141], cf. also Subsection 4.3.3).

For pure states we have discussed beside distillation the "dilution" of entanglement
and we can use, similar to $E_D$, the asymptotic rate with which bipartite
systems in a given state $\rho$ can be prepared out of maximally entangled singlets [78].
Hence consider again a sequence of LOCC channels

$$S_N : \mathcal{B}(\mathcal{H}^{\otimes N} \otimes \mathcal{K}^{\otimes N}) \to \mathcal{B}(\mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}) \qquad (5.10)$$

and a sequence of maximally entangled states $\Omega_N \in \mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}$, $N \in \mathbb{N}$, but now with the
property

$$\lim_{N\to\infty} \bigl\| \rho^{\otimes N} - S_N^*(|\Omega_N\rangle\langle\Omega_N|) \bigr\|_1 = 0. \qquad (5.11)$$

Then we can define the entanglement cost $E_C(\rho)$ of $\rho$ as

$$E_C(\rho) = \inf_{(S_N)_{N\in\mathbb{N}}} \liminf_{N\to\infty} \frac{\log_2(d_N)}{N}, \qquad (5.12)$$

where the infimum is taken over all dilution protocols $S_N$, $N \in \mathbb{N}$. It is again easy to
see that $E_C$ satisfies E0, E1, E2 and E5b. In contrast to $E_D$ however it can be shown
that $E_C$ is convex (Axiom E3), while it is not known whether $E_C$ is continuous
(Axiom E4); cf. ||7|] for proofs.

$E_D$ and $E_C$ are based directly on operational concepts. The remaining two measures
we want to discuss here are defined in a more abstract way. The first can be
characterized as the minimal convex extension of $E_{vN}$ to mixed states: We define
the entanglement of formation $E_F$ of $\rho$ as

$$E_F(\rho) = \inf_{\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|} \sum_j p_j E_{vN}(|\psi_j\rangle\langle\psi_j|), \qquad (5.13)$$

where the infimum is taken over all decompositions of $\rho$ into a convex sum of pure
states. $E_F$ satisfies E0 - E4 and E5a (cf. |l^ for E2 and [120] for E4; the rest follows
directly from the definition). Whether $E_F$ is (weakly) additive (Axiom E5b) is not
known. Furthermore it is conjectured that $E_F$ coincides with $E_C$. However, only the
identity $E_F^\infty = E_C$ is proven, where the existence of the regularization $E_F^\infty$ of $E_F$
follows directly from subadditivity.

Another idea to quantify entanglement is to measure the "distance" of the (entangled)
$\rho$ from the set of separable states $\mathcal{D}$. It has turned out [154] that among
all possible distance functions the relative entropy is physically most reasonable.
Hence we define the relative entropy of entanglement as

$$E_R(\rho) = \inf_{\sigma\in\mathcal{D}} S(\rho|\sigma), \qquad S(\rho|\sigma) = \mathrm{tr}(\rho\log_2\rho - \rho\log_2\sigma), \qquad (5.14)$$

where the infimum is taken over all separable states. It can be shown that $E_R$
satisfies, as $E_F$, the Axioms E0 - E4 and E5a, where E1 and E2 are shown in [154]
and E4 in [?]; the rest follows directly from the definition. It is shown in [159] that
$E_R$ does not satisfy E5b; cf. also Subsection 5.3. Hence the regularization $E_R^\infty$ of
$E_R$ differs from $E_R$.

Finally let us give now some comments on the relation between the measures just
introduced. On pure states all measures just discussed coincide with the reduced
von Neumann entropy; this follows from Theorem 5.1.2 and the properties stated in
the last Subsection. For mixed states the situation is more difficult. It can be shown
however that $E_D \le E_C$ holds and that all "reasonable" entanglement measures lie
in between [89].



Theorem 5.1.3 For each entanglement measure E satisfying EL, El, Ei and E5b 
and each state $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{K})$ we have $E_D(\rho) \le E(\rho) \le E_C(\rho)$.

Unfortunately no measure we have discussed in the last Subsection satisfies all 
the assumptions of the theorem. It is possible however to get a similar statement for 
the regularization E°° with weaker assumptions on E itself (in particular without 
assuming additivity); cf js?!] . 
5.2 Two qubits

Even more difficult than finding reasonable entanglement measures are explicit cal- 
culations. All measures we have discussed above involve optimization processes over 
spaces which grow exponentially with the dimension of the Hilbert space. A direct 
numerical calculation for a general state $\rho$ is therefore hopeless. There are however
some attempts to get either some bounds on entanglement measures or to get explicit
calculations for special classes of states. We will restrict this discussion to
some relevant special cases. On the one hand we will concentrate on $E_F$ and $E_R$ and
on the other we will look at two special classes of states where explicit calculations
are possible: Two qubit systems in this section and states with symmetry properties 
in the next one. 

5.2.1. Pure states. — Assume for the rest of this section that $\mathcal{H} = \mathbb{C}^2$ holds and
consider first a pure state $\psi \in \mathcal{H} \otimes \mathcal{H}$. To calculate $E_{vN}(\psi)$ is of course not difficult
and it is straightforward to see that (cf. for all material of this and the following
subsection [^61):

$$E_{vN}(\psi) = H\Bigl[\tfrac{1}{2}\bigl(1 + \sqrt{1 - C(\psi)^2}\bigr)\Bigr] \qquad (5.15)$$

holds, with

$$H(x) = -x\log_2(x) - (1 - x)\log_2(1 - x) \qquad (5.16)$$

and the concurrence $C(\psi)$ of $\psi$ which is defined by

$$C(\psi) = \Bigl|\sum_{j=0}^{3} a_j^2\Bigr| \quad \text{with} \quad \psi = \sum_{j=0}^{3} a_j\Phi_j, \qquad (5.17)$$

where $\Phi_j$, $j = 0, \ldots, 3$ denotes the Bell basis (p^). Since $C$ becomes rather important
in the following let us reexpress it as $C(\psi) = |\langle\psi, \Xi\psi\rangle|$, where $\psi \mapsto \Xi\psi$
denotes complex conjugation in Bell basis. Hence $\Xi$ is an antiunitary operator and
it can be written as the tensor product $\Xi = \xi \otimes \xi$ of the map $\mathcal{H} \ni \phi \mapsto \sigma_2\bar{\phi}$, where
$\bar{\phi}$ denotes complex conjugation in the canonical basis and $\sigma_2$ is the second Pauli
matrix. Hence local unitaries (i.e. those of the form $U_1 \otimes U_2$) commute with $\Xi$ and
it can be shown that this is not only a necessary but also a sufficient condition for
a unitary to be local [160].

We see from Equations (5.15) and (5.17) that $C(\psi)$ ranges from 0 to 1 and that
$E_{vN}(\psi)$ is a monotone function in $C(\psi)$. The latter can be considered therefore as
an entanglement quantity in its own right. For a Bell state we get in particular
$C(\Phi_j) = 1$ while a separable state $\phi_1 \otimes \phi_2$ leads to $C(\phi_1 \otimes \phi_2) = 0$; this can be seen
easily with the factorization $\Xi = \xi \otimes \xi$.
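The relation between concurrence and reduced entropy can be checked on concrete states. The sketch below is a minimal illustration under the assumption that the state is given by its four amplitudes in the canonical basis; applying $\sigma_2 \otimes \sigma_2$ together with complex conjugation then reduces $|\langle\psi, \Xi\psi\rangle|$ to $2|c_{00}c_{11} - c_{01}c_{10}|$. Function names and test states are illustrative choices.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    return -sum(p * math.log2(p) for p in (x, 1.0 - x) if p > 0)


def concurrence(psi):
    """|<psi, Xi psi>| for amplitudes (c00, c01, c10, c11) of a two qubit state."""
    c00, c01, c10, c11 = psi
    return abs(2 * (c00 * c11 - c01 * c10))


def entropy_from_concurrence(c):
    """Reduced von Neumann entropy as a function of the concurrence."""
    return binary_entropy(0.5 * (1.0 + math.sqrt(max(0.0, 1.0 - c * c))))


s = 1 / math.sqrt(2)
bell = (s, 0, 0, s)                       # (|00> + |11>)/sqrt(2)
product = (1, 0, 0, 0)                    # |00>
partial = (math.sqrt(0.8), 0, 0, math.sqrt(0.2))

for state in (bell, product, partial):
    c = concurrence(state)
    print(c, entropy_from_concurrence(c))
```

For the state with Schmidt weights (0.8, 0.2) the result agrees with the binary entropy of 0.8, as expected from the Schmidt decomposition.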

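Assume now that one of the $a_j$, say $a_0$, satisfies $|a_0|^2 > 1/2$. This implies that
$C(\psi)$ can not be zero since

$$\Bigl|\sum_{j=1}^{3} a_j^2\Bigr| \le 1 - |a_0|^2 \qquad (5.18)$$

must hold. Hence $C(\psi)$ is at least $2|a_0|^2 - 1$ and this implies for $E_{vN}$ and arbitrary $\psi$

$$E_{vN}(\psi) \ge h\bigl(|\langle\Phi_0, \psi\rangle|^2\bigr) \quad \text{with} \quad h(x) = \begin{cases} H\bigl[\tfrac{1}{2} + \sqrt{x(1-x)}\bigr] & x \ge \tfrac{1}{2} \\ 0 & x < \tfrac{1}{2} \end{cases} \qquad (5.19)$$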
This inequality remains valid if we replace $\Phi_0$ by any other maximally entangled
state $\Phi \in \mathcal{H} \otimes \mathcal{H}$. To see this note that two maximally entangled states $\Phi, \Phi' \in \mathcal{H} \otimes \mathcal{H}$
are related (up to a phase) by a local unitary transformation $U_1 \otimes U_2$ (this follows
immediately from their Schmidt decomposition; cf. Subsection 3.1.1). Hence, if we
replace the Bell basis in Equation (5.17) by $\Phi'_j = U_1 \otimes U_2\Phi_j$, $j = 0, \ldots, 3$ we get for
the corresponding $C'$ the equation $C'(\psi) = |\langle U_1^* \otimes U_2^*\psi, \Xi\, U_1^* \otimes U_2^*\psi\rangle| = C(\psi)$ since $\Xi$
commutes with local unitaries. We can even replace $|\langle\Phi_0, \psi\rangle|^2$ with the supremum
over all maximally entangled states and get therefore

$$E_{vN}(\psi) \ge h\bigl[\mathcal{F}(|\psi\rangle\langle\psi|)\bigr], \qquad (5.20)$$

where $\mathcal{F}(|\psi\rangle\langle\psi|)$ is the maximally entangled fraction of $|\psi\rangle\langle\psi|$ which we have introduced
in Subsection 3.1.1.

To see that even equality holds in Equation (5.20) note first that it is sufficient to
consider the case $\psi = a|00\rangle + b|11\rangle$ with $a, b \ge 0$, $a^2 + b^2 = 1$, since each pure state $\psi$
can be brought into this form (this follows again from the Schmidt decomposition)
by a local unitary transformation which on the other hand does not change $E_{vN}$.
The maximally entangled state which maximizes $|\langle\psi, \Phi\rangle|^2$ is in this case $\Phi_0$ and we
get $\mathcal{F}(|\psi\rangle\langle\psi|) = (a + b)^2/2 = 1/2 + ab$. Straightforward calculations show now that
$h[\mathcal{F}(|\psi\rangle\langle\psi|)] = h(1/2 + ab) = E_{vN}(\psi)$ holds as stated.
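The "straightforward calculations" can be spot-checked numerically. The sketch below assumes the piecewise definition $h(x) = H[1/2 + \sqrt{x(1-x)}]$ for $x \ge 1/2$ (and 0 otherwise) and compares $h(1/2 + ab)$ with the reduced entropy $H(a^2)$ of $a|00\rangle + b|11\rangle$. The helper names and the sample values of $a$ are illustrative.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    return -sum(p * math.log2(p) for p in (x, 1.0 - x) if p > 0)


def h(x):
    """h(x) = H[1/2 + sqrt(x(1 - x))] for x >= 1/2 and 0 otherwise."""
    if x < 0.5:
        return 0.0
    return binary_entropy(0.5 + math.sqrt(max(0.0, x * (1.0 - x))))


def compare(a):
    """Return (h(1/2 + ab), H(a^2)) for psi = a|00> + b|11>, b = sqrt(1 - a^2)."""
    b = math.sqrt(1.0 - a * a)
    fraction = 0.5 + a * b      # maximally entangled fraction (a + b)^2 / 2
    return h(fraction), binary_entropy(a * a)


for a in (0.6, 0.8, 0.95):
    print(a, compare(a))
```

Both entries of each pair agree, because $(1/2 + ab)(1/2 - ab) = (a^2 - b^2)^2/4$ when $a^2 + b^2 = 1$.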

5.2.2. EOF for Bell diagonal states. — It is easy to extend the inequality 



(5.20) to mixed states if we use the convexity of $E_F$ and the fact that $E_F$ coincides
with $E_{vN}$ on pure states. Hence (5.20) becomes

$$E_F(\rho) \ge h[\mathcal{F}(\rho)]. \qquad (5.21)$$

For general two qubit states this bound is not achieved however. This can be seen
with the example $\rho = \tfrac{1}{2}(|\Phi_1\rangle\langle\Phi_1| + |00\rangle\langle 00|)$, which we have considered already
in the last paragraph of Subsection 3.1.1. It is easy to see that $\mathcal{F}(\rho) = 1/2$ holds,
hence $h[\mathcal{F}(\rho)] = 0$, but $\rho$ is entangled. Nevertheless we can show that equality holds
in Equation (5.21) if we restrict it to Bell diagonal states $\rho = \sum_{j=0}^{3} \lambda_j|\Phi_j\rangle\langle\Phi_j|$. To
prove this statement we have to find a convex decomposition $\rho = \sum_j \mu_j|\Psi_j\rangle\langle\Psi_j|$
of such a $\rho$ into pure states $|\Psi_j\rangle$ such that $h[\mathcal{F}(\rho)] = \sum_j \mu_j E_{vN}(|\Psi_j\rangle\langle\Psi_j|)$
holds. Since $E_F(\rho)$ can not be smaller than $h[\mathcal{F}(\rho)]$ due to inequality (5.21) this
decomposition must be optimal and equality is proven.

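To find such $\Psi_j$ assume first that the biggest eigenvalue of $\rho$ is greater than 1/2,
and let, without loss of generality, $\lambda_1$ be this eigenvalue. A good choice for the $\Psi_j$
are then the eight pure states

$$\sqrt{\lambda_1}\,\Phi_1 + i\bigl(\pm\sqrt{\lambda_0}\,\Phi_0 \pm \sqrt{\lambda_2}\,\Phi_2 \pm \sqrt{\lambda_3}\,\Phi_3\bigr). \qquad (5.22)$$

The reduced von Neumann entropy of all these states equals $h(\lambda_1)$, hence
$\sum_j \mu_j E_{vN}(|\Psi_j\rangle\langle\Psi_j|) = h(\lambda_1)$ and therefore $E_F(\rho) = h(\lambda_1)$. Since the maximally
entangled fraction of $\rho$ is obviously $\lambda_1$ we see that (5.21) holds with equality.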
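A decomposition of this kind can be verified numerically. The sketch below makes two assumptions that are not taken from the text: the Bell basis is given the phases listed in `BELL` (chosen so that the concurrence equals $|\sum_j a_j^2|$), and the eight states are taken to be $\sqrt{\lambda_1}\Phi_1 + i(\pm\sqrt{\lambda_0}\Phi_0 \pm \sqrt{\lambda_2}\Phi_2 \pm \sqrt{\lambda_3}\Phi_3)$. It checks that their equal-weight mixture reproduces the Bell diagonal state and that each of them has concurrence $2\lambda_1 - 1$. The eigenvalues are hypothetical example values.

```python
import itertools
import math

S = 1 / math.sqrt(2)
# Assumed Bell basis phases; amplitudes are ordered as 00, 01, 10, 11.
BELL = [
    (S, 0, 0, S),
    (1j * S, 0, 0, -1j * S),
    (0, 1j * S, 1j * S, 0),
    (0, S, -S, 0),
]


def combine(coeffs):
    """Return sum_j coeffs[j] * Phi_j as a tuple of four amplitudes."""
    return tuple(sum(c * phi[k] for c, phi in zip(coeffs, BELL)) for k in range(4))


def concurrence(psi):
    """Concurrence of a pure two qubit state from its canonical amplitudes."""
    c00, c01, c10, c11 = psi
    return abs(2 * (c00 * c11 - c01 * c10))


def eight_states(lams):
    """All sign choices for the three smaller eigenvalues (lams[1] is the largest)."""
    l0, l1, l2, l3 = lams
    states = []
    for s0, s2, s3 in itertools.product((1, -1), repeat=3):
        coeffs = (1j * s0 * math.sqrt(l0), math.sqrt(l1),
                  1j * s2 * math.sqrt(l2), 1j * s3 * math.sqrt(l3))
        states.append(combine(coeffs))
    return states


def mixture(weights, states):
    """Density matrix sum_k w_k |psi_k><psi_k| as a nested 4x4 list."""
    return [[sum(w * psi[r] * psi[c].conjugate() for w, psi in zip(weights, states))
             for c in range(4)] for r in range(4)]


lams = (0.1, 0.7, 0.15, 0.05)          # hypothetical eigenvalues, largest is 0.7
states = eight_states(lams)
rho_from_states = mixture([1 / 8] * 8, states)
rho_bell = mixture(lams, BELL)
max_deviation = max(abs(rho_from_states[r][c] - rho_bell[r][c])
                    for r in range(4) for c in range(4))
concurrences = [concurrence(psi) for psi in states]
print(max_deviation, concurrences)
```

Since all eight states share the same concurrence, they also share the same reduced entropy, which is what the optimality argument requires.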

Assume now that the highest eigenvalue is less than 1/2. Then we can find phase
factors $\exp(i\phi_j)$ such that $\sum_{j=0}^{3} \exp(i\phi_j)\lambda_j = 0$ holds and $\rho$ can be expressed as a
convex linear combination of the states

$$e^{i\phi_0/2}\sqrt{\lambda_0}\,\Phi_0 + \sum_{j=1}^{3} \bigl(\pm e^{i\phi_j/2}\sqrt{\lambda_j}\bigr)\Phi_j. \qquad (5.23)$$

The concurrence $C$ of all these states is 0, hence their entanglement is 0 by Equation
(5.15), which in turn implies $E_F(\rho) = 0$. Again we see that equality is achieved in
(5.21) since the maximally entangled fraction of $\rho$ is less than 1/2. Summarizing
this discussion we have shown (cf. Figure 5.1)



Proposition 5.2.1 A Bell diagonal state $\rho$ is entangled iff its highest eigenvalue
$\lambda$ is greater than 1/2. In this case the Entanglement of Formation of $\rho$ is given by

$$E_F(\rho) = H\Bigl[\tfrac{1}{2} + \sqrt{\lambda(1-\lambda)}\Bigr]. \qquad (5.24)$$

Figure 5.1: Entanglement of Formation and Relative Entropy of Entanglement for
Bell diagonal states, plotted as a function of the highest eigenvalue $\lambda$ of $\rho$.



5.2.3. Wootters formula. — If we have a general two qubit state $\rho$ there is
a formula of Wootters [172] which allows an easy calculation of $E_F$. It is based
on a generalization of the concurrence $C$ to mixed states. To motivate it rewrite
$C^2(\psi)$ as

$$C^2(\psi) = \mathrm{tr}\bigl(|\psi\rangle\langle\psi|\,|\Xi\psi\rangle\langle\Xi\psi|\bigr) = \mathrm{tr}(\rho\,\Xi\rho\,\Xi) = \mathrm{tr}(R^2) \qquad (5.25)$$

with

$$R = \sqrt{\sqrt{\rho}\,\Xi\rho\,\Xi\sqrt{\rho}}. \qquad (5.26)$$

Here we have set $\rho = |\psi\rangle\langle\psi|$. The definition of the hermitian matrix $R$ however
makes sense for arbitrary $\rho$ as well. If we write $\lambda_j$, $j = 1, \ldots, 4$ for the eigenvalues of
$R$ and $\lambda_1$ is without loss of generality the biggest one we can define the concurrence
of an arbitrary two qubit state $\rho$ as [172]

$$C(\rho) = \max(0, 2\lambda_1 - \mathrm{tr}(R)) = \max(0, \lambda_1 - \lambda_2 - \lambda_3 - \lambda_4). \qquad (5.27)$$

It is easy to see that $C(|\psi\rangle\langle\psi|)$ coincides with $C(\psi)$ from (5.17). The crucial point
is now that Equation (5.15) holds for $E_F(\rho)$ if we insert $C(\rho)$ instead of $C(\psi)$:



Theorem 5.2.2 (Wootters Formula) The Entanglement of Formation of a
two qubit system in a state $\rho$ is given by

$$E_F(\rho) = H\Bigl[\tfrac{1}{2}\bigl(1 + \sqrt{1 - C(\rho)^2}\bigr)\Bigr], \qquad (5.28)$$

where the concurrence of $\rho$ is given in Equation (5.27) and $H$ denotes the binary
entropy from (5.16).

To prove this theorem we have to find first a convex decomposition $\rho = \sum_j p_j|\Psi_j\rangle\langle\Psi_j|$
of $\rho$ into pure states $\Psi_j$ such that the average reduced von Neumann
entropy $\sum_j p_j E_{vN}(\Psi_j)$ coincides with the right hand side of Equation (5.28).
Second we have to show that we have really found the minimal decomposition. Since
this is much more involved than the simple case discussed in Subsection 5.2.2 we
omit the proof and refer to [172] instead. Note however that Equation (5.28) really
coincides with the special cases we have derived for pure and Bell diagonal states.
Finally let us add the remark that there is no analogue of Wootters' formula for
higher dimensional Hilbert spaces. It can be shown [160] that the essential properties
of the Bell basis $\Phi_j$, $j = 0, \ldots, 3$ which would be necessary for such a generalization
are available only in 2 x 2 dimensions.
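The agreement with the special cases can be made concrete. The sketch below evaluates Wootters' formula directly from a list of eigenvalues of $R$; it relies on two observations that are added here as assumptions for the illustration: for a Bell diagonal state $\Xi\rho\,\Xi = \rho$, so the eigenvalues of $R$ are the eigenvalues $\lambda_j$ of $\rho$, and for a pure state $R$ has the single nonzero eigenvalue $C(\psi)$. Function names and sample spectra are illustrative.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    return -sum(p * math.log2(p) for p in (x, 1.0 - x) if p > 0)


def wootters_eof(r_eigenvalues):
    """Entanglement of formation from the eigenvalues of R."""
    lam = sorted(r_eigenvalues, reverse=True)
    c = max(0.0, 2 * lam[0] - sum(lam))      # concurrence of the mixed state
    return binary_entropy(0.5 * (1 + math.sqrt(max(0.0, 1 - c * c))))


def bell_diagonal_eof(lam_max):
    """Closed form for Bell diagonal states with highest eigenvalue lam_max."""
    if lam_max <= 0.5:
        return 0.0
    return binary_entropy(0.5 + math.sqrt(lam_max * (1 - lam_max)))


entangled = [0.7, 0.1, 0.15, 0.05]   # hypothetical Bell diagonal spectrum
separable = [0.4, 0.3, 0.2, 0.1]
print(wootters_eof(entangled), bell_diagonal_eof(max(entangled)))
print(wootters_eof(separable), bell_diagonal_eof(max(separable)))
```

With concurrence $C = 2\lambda - 1$ one has $\tfrac{1}{2}(1 + \sqrt{1 - C^2}) = \tfrac{1}{2} + \sqrt{\lambda(1-\lambda)}$, which is why both functions return the same value.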

5.2.4. Relative entropy for Bell diagonal states. — To calculate the Relative 
Entropy of Entanglement $E_R$ for two qubit systems is more difficult. However there
is at least an easy formula for Bell diagonal states which we will give in the following;
[154]

Proposition 5.2.3 The Relative Entropy of Entanglement for a Bell diagonal state
$\rho$ with highest eigenvalue $\lambda$ is given by (cf. Figure 5.1)

$$E_R(\rho) = \begin{cases} 1 - H(\lambda) & \lambda > \tfrac{1}{2} \\ 0 & \lambda \le \tfrac{1}{2} \end{cases} \qquad (5.29)$$

Proof. For a Bell diagonal state $\rho = \sum_{j=0}^{3} \lambda_j|\Phi_j\rangle\langle\Phi_j|$ we have to calculate

$$E_R(\rho) = \inf_{\sigma\in\mathcal{D}} \bigl[\mathrm{tr}(\rho\log_2\rho - \rho\log_2\sigma)\bigr] \qquad (5.30)$$
$$\phantom{E_R(\rho)} = \mathrm{tr}(\rho\log_2\rho) + \inf_{\sigma\in\mathcal{D}} \Bigl[-\sum_{j=0}^{3} \lambda_j\langle\Phi_j, \log_2(\sigma)\Phi_j\rangle\Bigr]. \qquad (5.31)$$

Since log is a concave function we have $-\log_2\langle\Phi_j, \sigma\Phi_j\rangle \le -\langle\Phi_j, \log_2(\sigma)\Phi_j\rangle$ and
therefore

$$E_R(\rho) \ge \mathrm{tr}(\rho\log_2\rho) + \inf_{\sigma\in\mathcal{D}} \Bigl[-\sum_{j=0}^{3} \lambda_j\log_2\langle\Phi_j, \sigma\Phi_j\rangle\Bigr]. \qquad (5.32)$$

Hence only the diagonal elements of $\sigma$ in the Bell basis enter the minimization
on the right hand side of this inequality and this implies that we can restrict the
infimum to the set of separable Bell diagonal states. Since a Bell diagonal state is
separable iff all its eigenvalues are less than 1/2 (Proposition 5.2.1) we get

$$E_R(\rho) \ge \mathrm{tr}(\rho\log_2\rho) + \inf_{p_j\in[0,1/2]} \Bigl[-\sum_{j=0}^{3} \lambda_j\log_2 p_j\Bigr] \quad \text{with} \quad \sum_j p_j = 1. \qquad (5.33)$$

This is an optimization problem (with constraints) over only four real parameters
and easy to solve. If the highest eigenvalue of $\rho$ is greater than 1/2 we get $p_1 = 1/2$
and $p_j = \lambda_j/(2 - 2\lambda)$, where we have chosen without loss of generality $\lambda = \lambda_1$. We
get a lower bound on $E_R(\rho)$ which is achieved if we insert the corresponding $\sigma$ in
Equation (5.31). Hence we have proven the statement for $\lambda > 1/2$, which completes
the proof, since we have seen already that $\lambda \le 1/2$ implies that $\rho$ is separable
(Proposition 5.2.1). □
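The constrained optimization over four parameters is small enough to check by brute force. The sketch below evaluates the objective $\sum_j \lambda_j\log_2(\lambda_j/p_j)$ at the stated minimizer $p_1 = 1/2$, $p_j = \lambda_j/(2 - 2\lambda)$, compares it with $1 - H(\lambda)$ (the value obtained by inserting this minimizer), and confirms on a coarse grid that no admissible $p$ does better. The grid resolution, the function names and the example eigenvalues are assumptions made for the illustration.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    return -sum(p * math.log2(p) for p in (x, 1.0 - x) if p > 0)


def objective(lams, ps):
    """tr(rho log2 rho) - sum_j lam_j log2 p_j for Bell diagonal rho and sigma."""
    return sum(l * math.log2(l / p) for l, p in zip(lams, ps) if l > 0)


def closed_form_minimizer(lams):
    """p = 1/2 for the largest eigenvalue lam, p_j = lam_j / (2 - 2 lam) otherwise."""
    top = max(range(4), key=lambda j: lams[j])
    lam = lams[top]
    return [0.5 if j == top else lams[j] / (2 - 2 * lam) for j in range(4)]


def grid_minimum(lams, steps=40):
    """Brute-force minimum over p_j in (0, 1/2] with sum_j p_j = 1."""
    best = math.inf
    for i in range(1, steps + 1):
        for j in range(1, steps + 1):
            for k in range(1, steps + 1):
                p = [0.5 * i / steps, 0.5 * j / steps, 0.5 * k / steps]
                last = 1.0 - sum(p)
                if 0 < last <= 0.5:
                    best = min(best, objective(lams, p + [last]))
    return best


lams = [0.7, 0.1, 0.15, 0.05]        # hypothetical spectrum with lam = 0.7 > 1/2
closed = objective(lams, closed_form_minimizer(lams))
print(closed, 1 - binary_entropy(0.7), grid_minimum(lams))
```

The grid search only probes finitely many points, so it supports the closed form without replacing the argument given in the proof.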
5.3 Entanglement measures under symmetry

The problems occurring if we try to calculate quantities like $E_R$ or $E_F$ for general
density matrices arise from the fact that we have to solve optimization problems
over very high dimensional spaces. One possible strategy to get explicit results is
therefore parameter reduction by symmetry arguments. This can be done if the state
in question admits some invariance properties like Werner, isotropic or OO-invariant
states; cf. Section 3.1. We will give in the following some particular examples for
such calculations, while a detailed discussion of the general idea (together with many
more examples and further references) can be found in [159].

5.3.1. Entanglement of Formation. — Consider a compact group of unitaries 
$G \subset \mathcal{B}(\mathcal{H} \otimes \mathcal{H})$ (where $\mathcal{H}$ is again arbitrary finite dimensional), the set of $G$-invariant
states, i.e. all $\rho$ with $[V, \rho] = 0$ for all $V \in G$, and the corresponding twirl operation
$P_G\sigma = \int_G V\sigma V^*\,dV$. Particular examples we are looking at are: 1. Werner states
where $G$ consists of all unitaries $U \otimes U$, 2. isotropic states where each $V \in G$ has the
form $V = U \otimes \bar{U}$ and finally 3. OO-invariant states where $G$ consists of unitaries
$U \otimes U$ with real matrix elements ($U = \bar{U}$) and the twirl is given in Equation (3.24).

One way to calculate $E_F$ for a $G$-invariant state $\rho$ consists now of the following
steps: 1. Determine the set $M_\rho$ of pure states $\Phi$ such that $P_G|\Phi\rangle\langle\Phi| = \rho$ holds. 2.
Calculate the function

$$P_G\mathcal{S} \ni \rho \mapsto \epsilon_G(\rho) = \inf\{E_{vN}(\sigma) \mid \sigma \in M_\rho\} \in \mathbb{R}, \qquad (5.34)$$

where we have denoted the set of $G$-invariant states with $P_G\mathcal{S}$. 3. Determine $E_F(\rho)$
then in terms of the convex hull of $\epsilon_G$, i.e.

$$E_F(\rho) = \inf\Bigl\{\sum_j \lambda_j\,\epsilon_G(\sigma_j) \,\Big|\, \sigma_j \in P_G\mathcal{S},\ 0 \le \lambda_j \le 1,\ \rho = \sum_j \lambda_j\sigma_j,\ \sum_j \lambda_j = 1\Bigr\}. \qquad (5.35)$$

The equality in the last Equation is of course a non-trivial statement which has to
be proved. We skip this point, however, and refer the reader to [159]. The advantage
of this scheme relies on the fact that spaces of $G$-invariant states are in general very
low dimensional (if $G$ is not too small). Hence the optimization problem contained
in step 3 has a much bigger chance to be tractable than the one we have to solve for
the original definition of $E_F$. There is of course no guarantee that any of these three
steps can be carried out in a concrete situation. For the three examples mentioned
above, however, there are results available, which we will present in the following.

5.3.2. Werner states. — Let us start with Werner states [159]. In this case $\rho$ is
uniquely determined by its flip expectation value $\mathrm{tr}(\rho F)$ (cf. Subsection 3.1.2). To
determine $\Phi \in \mathcal{H} \otimes \mathcal{H}$ such that $P_{UU}|\Phi\rangle\langle\Phi| = \rho$ holds, we have to solve therefore
the equation

$$\langle\Phi, F\Phi\rangle = \sum_{jk} \bar{\Phi}_{jk}\Phi_{kj} = \mathrm{tr}(F\rho), \qquad (5.36)$$

where $\Phi_{jk}$ denote components of $\Phi$ in the canonical basis. On the other hand the
reduced density matrix $\rho = \mathrm{tr}_1|\Phi\rangle\langle\Phi|$ has the matrix elements $\rho_{jk} = \sum_l \Phi_{jl}\bar{\Phi}_{kl}$.
By exploiting $U \otimes U$ invariance we can assume without loss of generality that $\rho$ is
diagonal. Hence to get the function $\epsilon_{UU}$ we have to minimize

$$E_{vN}(|\Phi\rangle\langle\Phi|) = \sum_j S\Bigl[\sum_k |\Phi_{jk}|^2\Bigr] \qquad (5.37)$$

under the constraint (5.36), where $S(x) = -x\log_2(x)$ denotes the von Neumann
entropy. We skip these calculations here (see [159] instead) and state the results
only. For $\mathrm{tr}(F\rho) \ge 0$ we get $\epsilon_{UU}(\rho) = 0$ (as expected since $\rho$ is separable in this case)
and with $H$ from (5.16)

$$\epsilon_{UU}(\rho) = H\Bigl[\tfrac{1}{2}\bigl(1 - \sqrt{1 - \mathrm{tr}(F\rho)^2}\bigr)\Bigr] \qquad (5.38)$$

for $\mathrm{tr}(F\rho) < 0$. The minima are taken for $\Phi$ where all $\Phi_{jk}$ except one diagonal
element are zero in the case $\mathrm{tr}(F\rho) \ge 0$ and for $\Phi$ with only two (non-diagonal)
coefficients $\Phi_{jk}, \Phi_{kj}$, $j \ne k$ nonzero if $\mathrm{tr}(\rho F) < 0$. The function $\epsilon_{UU}$ is convex and
coincides therefore with its convex hull such that we get

Proposition 5.3.1 For any Werner state $\rho$ the Entanglement of Formation is
given by (cf. Figure 5.2)

$$E_F(\rho) = \begin{cases} H\Bigl[\tfrac{1}{2}\bigl(1 - \sqrt{1 - \mathrm{tr}(F\rho)^2}\bigr)\Bigr] & \mathrm{tr}(F\rho) < 0 \\ 0 & \mathrm{tr}(F\rho) \ge 0. \end{cases} \qquad (5.39)$$

Figure 5.2: Entanglement of Formation for Werner states plotted as function of the
flip expectation.
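The closed form for Werner states is easy to evaluate. The following sketch implements $E_F$ as a function of the flip expectation, assuming the case distinction $\mathrm{tr}(F\rho) < 0$ versus $\mathrm{tr}(F\rho) \ge 0$, and cross-checks the two qubit case: a Werner state on $\mathbb{C}^2 \otimes \mathbb{C}^2$ with singlet weight $\lambda$ is Bell diagonal with flip expectation $1 - 2\lambda$, so both formulas should agree. The function names and sample values are illustrative.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    return -sum(p * math.log2(p) for p in (x, 1.0 - x) if p > 0)


def werner_eof(flip):
    """Entanglement of formation of a Werner state with flip expectation tr(F rho)."""
    if flip >= 0:
        return 0.0
    return binary_entropy(0.5 * (1 - math.sqrt(max(0.0, 1 - flip * flip))))


def bell_diagonal_eof(lam_max):
    """Two qubit closed form in terms of the highest eigenvalue."""
    if lam_max <= 0.5:
        return 0.0
    return binary_entropy(0.5 + math.sqrt(lam_max * (1 - lam_max)))


for singlet_weight in (0.6, 0.8, 1.0):
    flip = 1 - 2 * singlet_weight      # triplets contribute +1, the singlet -1
    print(flip, werner_eof(flip), bell_diagonal_eof(singlet_weight))
```

Agreement follows from the symmetry $H(x) = H(1 - x)$ of the binary entropy.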



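5.3.3. Isotropic states. — Let us consider now isotropic, i.e. $U \otimes \bar{U}$ invariant
states. They are determined by the expectation value $\mathrm{tr}(\rho\tilde{F})$ with $\tilde{F}$ from Equation
(3.14). Hence we have to look first for pure states $\Phi$ with $\langle\Phi, \tilde{F}\Phi\rangle = \mathrm{tr}(\rho\tilde{F})$ (since
this determines, as for Werner states above, those $\Phi$ with $P_{U\bar{U}}(|\Phi\rangle\langle\Phi|) = \rho$). To
this end assume that $\Phi$ has the Schmidt decomposition $\Phi = \sum_j \lambda_j f_j \otimes f'_j = (U_1 \otimes U_2)\sum_j \lambda_j e_j \otimes e_j$
with appropriate unitary matrices $U_1$, $U_2$ and the canonical basis
$e_j$, $j = 1, \ldots, d$. Exploiting the $U \otimes \bar{U}$ invariance of $\rho$ we get

$$\mathrm{tr}(\rho\tilde{F}) = \Bigl\langle (\mathbb{1} \otimes V)\sum_j \lambda_j e_j \otimes e_j,\ \tilde{F}\,(\mathbb{1} \otimes V)\sum_k \lambda_k e_k \otimes e_k \Bigr\rangle \qquad (5.40)$$
$$\phantom{\mathrm{tr}(\rho\tilde{F})} = \sum_{j,k} \lambda_j\lambda_k \bigl\langle e_j \otimes Ve_j,\ \tilde{F}\,(e_k \otimes Ve_k)\bigr\rangle \qquad (5.41)$$
$$\phantom{\mathrm{tr}(\rho\tilde{F})} = \Bigl|\sum_j \lambda_j\langle e_j, Ve_j\rangle\Bigr|^2 \qquad (5.42)$$

with $V = U_1^T U_2$ and after inserting the definition of $\tilde{F}$. Following our general
scheme, we have to minimize $E_{vN}(|\Phi\rangle\langle\Phi|)$ under the constraint given in Equation
(5.42). This is explicitly done in [150]. We will only state the result here, which
leads to the function

$$\epsilon_{U\bar{U}}(\rho) = \begin{cases} H(\gamma) + (1 - \gamma)\log_2(d - 1) & \mathrm{tr}(\rho\tilde{F}) \ge 1 \\ 0 & \mathrm{tr}(\rho\tilde{F}) < 1 \end{cases} \qquad (5.43)$$

with

$$\gamma = \frac{1}{d^2}\Bigl(\sqrt{\mathrm{tr}(\rho\tilde{F})} + \sqrt{[d-1][d - \mathrm{tr}(\rho\tilde{F})]}\Bigr)^2. \qquad (5.44)$$

Figure 5.3: $\epsilon$-function for isotropic states plotted as a function of the flip expectation.
For $d > 2$ it is not convex near the right endpoint.

For $d \ge 3$ this function is not convex (cf. Figure 5.3), hence we get

Proposition 5.3.2 For any isotropic state the Entanglement of Formation is given
as the convex hull

$$E_F(\rho) = \inf\Bigl\{\sum_j \lambda_j\,\epsilon_{U\bar{U}}(\sigma_j) \,\Big|\, \sigma_j \in P_{U\bar{U}}\mathcal{S},\ 0 \le \lambda_j \le 1,\ \rho = \sum_j \lambda_j\sigma_j,\ \sum_j \lambda_j = 1\Bigr\} \qquad (5.45)$$

of the function in Equation (5.43).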
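The $\epsilon$-function for isotropic states can be explored numerically. The sketch below assumes the form $H(\gamma) + (1 - \gamma)\log_2(d - 1)$ with $\gamma = d^{-2}\bigl(\sqrt{t} + \sqrt{(d-1)(d-t)}\bigr)^2$ for $t = \mathrm{tr}(\rho\tilde{F}) \ge 1$ and 0 otherwise, where $t$ ranges over $[0, d]$. It checks the endpoint values and uses a second difference as a rough convexity probe. The function names, sample points and step sizes are illustrative assumptions.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    return -sum(p * math.log2(p) for p in (x, 1.0 - x) if p > 0)


def isotropic_epsilon(t, d):
    """epsilon-function of an isotropic state with t = tr(rho F~), 0 <= t <= d."""
    if t < 1:
        return 0.0
    root = math.sqrt(t) + math.sqrt((d - 1) * max(0.0, d - t))
    gamma = root * root / d ** 2
    return binary_entropy(gamma) + (1 - gamma) * math.log2(d - 1)


def second_difference(d, t, step):
    """Discrete curvature; negative values indicate a locally non-convex function."""
    return (isotropic_epsilon(t + step, d) - 2 * isotropic_epsilon(t, d)
            + isotropic_epsilon(t - step, d))


print(isotropic_epsilon(3.0, 3), math.log2(3))   # maximally entangled endpoint
print(second_difference(2, 1.5, 0.1))            # two qubits
print(second_difference(3, 2.85, 0.15))          # d = 3 near the right endpoint
```

A negative second difference near $t = d$ for $d = 3$ is the numerical counterpart of the remark that the convex hull has to be taken explicitly.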
5.3.4. OO-invariant states. — The results derived for isotropic and Werner
states can be extended now to a large part of the set of OO-invariant states without
solving new minimization problems. This is possible, because the definition of
$E_F$ in Equation (5.13) allows under some conditions an easy extension to a suitable
set of non-symmetric states. If more precisely a nontrivial, minimizing decomposition
$\rho = \sum_j p_j|\psi_j\rangle\langle\psi_j|$ of $\rho$ is known, all states $\rho'$ which are a convex linear
combination of the same $|\psi_j\rangle\langle\psi_j|$ but arbitrary weights $p'_j$ have the same $E_F$ as $\rho$
(see [159] for proof of the statement). For the general scheme we have presented in
Subsection 5.3.1 this implies the following: If we know the pure states $\sigma \in M_\rho$
which solve the minimization problem for $\epsilon_G(\rho)$ in Equation (5.34) we get a
minimizing decomposition of $\rho$ in terms of $U \in G$ translated copies of $\sigma$. This follows
from the fact that $\rho$ is by definition of $M_\rho$ the twirl of $\sigma$. Hence any convex linear
combination of pure states $U\sigma U^*$ with $U \in G$ has the same $E_F$ as $\rho$.

Figure 5.4: State space of OO-invariant states.

A detailed analysis of the corresponding optimization problems in the case of Werner
and isotropic states (which we have omitted here; see [159, 150] instead) leads
therefore to the following results about OO-invariant states: The space of OO-invariant
states decomposes into four regions: The separable square and three triangles A, B, C;
cf. Figure 5.4. For all states $\rho$ in triangle A we can calculate $E_F(\rho)$ as for Werner
states in Proposition 5.3.1 and in triangle B we have to apply the result for isotropic
states from Proposition 5.3.2. This implies in particular that $E_F$ depends in A only on
$\mathrm{tr}(\rho F)$ and in B only on $\mathrm{tr}(\rho\tilde{F})$ and the dimension.



5.3.5. Relative Entropy of Entanglement. — To calculate $E_R(\rho)$ for a symmetric
state $\rho$ is even easier than the treatment of $E_F(\rho)$, because we can restrict the
minimization in the definition of $E_R(\rho)$ in Equation (5.14) to $G$-invariant separable
states, provided $G$ is a group of local unitaries. To see this assume that $\sigma \in \mathcal{D}$
minimizes $S(\rho|\sigma)$ for a $G$-invariant state $\rho$. Then we get $S(\rho|U\sigma U^*) = S(\rho|\sigma)$ for
all $U \in G$ since the relative entropy $S$ is invariant under unitary transformations
of both arguments and due to its convexity we even get $S(\rho|P_G\sigma) \le S(\rho|\sigma)$. Hence
$P_G\sigma$ minimizes $S(\rho|\,\cdot\,)$ as well, and since $P_G\sigma \in \mathcal{D}$ holds for a group $G$ of local
unitaries, we get $E_R(\rho) = S(\rho|P_G\sigma)$ as stated.

Figure 5.5: Relative Entropy of Entanglement for Werner states, plotted as a function
of the flip expectation.

The sets of Werner and isotropic states are just intervals and the corresponding
separable states form subintervals over which we have to perform the optimization.
Due to the convexity of the relative entropy in both arguments, however, it is
clear that the minimum is attained exactly at the boundary between entangled and
separable states. For Werner states this is the state $\sigma_0$ with $\mathrm{tr}(F\sigma_0) = 0$, i.e. it
gives equal weight to both minimal projections. To get $E_R(\rho)$ for a Werner state $\rho$
we have to calculate therefore only the relative entropy with respect to this state.
Since all Werner states can be simultaneously diagonalized this is easily done and
we get:

$$E_R(\rho) = 1 - S\Bigl(\frac{1 + \mathrm{tr}(F\rho)}{2}, \frac{1 - \mathrm{tr}(F\rho)}{2}\Bigr). \qquad (5.46)$$

Similarly, the boundary point $\sigma_1$ for isotropic states is given by $\mathrm{tr}(\tilde{F}\sigma_1) = 1$ which
leads to

$$E_R(\rho) = \log_2 d - \Bigl(1 - \frac{\mathrm{tr}(\tilde{F}\rho)}{d}\Bigr)\log_2(d - 1) - S\Bigl(\frac{\mathrm{tr}(\tilde{F}\rho)}{d}, 1 - \frac{\mathrm{tr}(\tilde{F}\rho)}{d}\Bigr) \qquad (5.47)$$

for each entangled isotropic state $\rho$, and 0 if $\rho$ is separable. ($S(p_1, p_2)$ denotes here
the entropy of the probability vector $(p_1, p_2)$.)
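Because all states within one family commute, the relative entropy with respect to the boundary state reduces to a classical relative entropy between two-point weight distributions. The sketch below uses this reduction, which is an assumption spelled out here rather than quoted from the text: Werner states are described by their weights on the symmetric and antisymmetric projections, isotropic states by the weight $f = \mathrm{tr}(\tilde{F}\rho)/d$ on the maximally entangled projection. It then compares the results with the closed forms $1 - S(\cdot,\cdot)$ and $\log_2 d - (1 - f)\log_2(d - 1) - S(f, 1 - f)$. Function names and sample values are illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def relative_entropy(p, q):
    """Classical relative entropy sum_i p_i log2(p_i / q_i)."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def werner_rel_ent(flip):
    """Relative entropy of entanglement of a Werner state, flip = tr(F rho)."""
    if flip >= 0:
        return 0.0
    weights = [(1 + flip) / 2, (1 - flip) / 2]   # symmetric, antisymmetric part
    return relative_entropy(weights, [0.5, 0.5])


def isotropic_rel_ent(t, d):
    """Relative entropy of entanglement of an isotropic state, t = tr(F~ rho)."""
    if t <= 1:
        return 0.0
    f = t / d
    return relative_entropy([f, 1 - f], [1 / d, 1 - 1 / d])


print(werner_rel_ent(-1.0))          # antisymmetric state
print(isotropic_rel_ent(3.0, 3))     # maximally entangled state for d = 3
print(isotropic_rel_ent(2.0, 3))
```

The antisymmetric Werner state gives the value 1 for every dimension, since the dimension never enters the two-point weights.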
Figure 5.6: Relative Entropy of Entanglement for isotropic states and $d = 2, 3, 4$,
plotted as a function of $\mathrm{tr}(\rho\tilde{F})$.

Let us consider now OO-invariant states. As for EOF we divide the state space
into the separable square and the three triangles A, B, C; cf. Figure 5.4. The state
at the coordinates (1, d) is a maximally entangled state and all separable states on
the line connecting (0,1) with (1,1) minimize the relative entropy for this state.
Hence consider a particular state $\sigma$ on this line. The convexity property of the
relative entropy shows immediately that $\sigma$ is a minimizer for all states on the line
connecting $\sigma$ with the state at (1,d). In this way it is easy to calculate $E_R(\rho)$ for
all $\rho$ in A. In a similar way we can treat the triangle B: We just have to draw a line
from $\rho$ to the state at (−1,0) and find the minimizer for $\rho$ at the intersection with
the separable border between (0,0) and (0,1). For all states in the triangle C the
relative entropy is minimized by the separable state at (0,1).

An application of the scheme just reviewed is a proof that $E_R$ is not additive, i.e.
it does not satisfy Axiom E5b. To see this consider the state $\rho = \mathrm{tr}(P_-)^{-1}P_-$ where
$P_-$ denotes the projector on the antisymmetric subspace. It is a Werner state with
flip expectation −1 (i.e. it corresponds to the point (−1,0) in Figure 5.4). According
to our discussion above $S(\rho|\,\cdot\,)$ is minimized in this case by the separable state $\sigma_0$
and we get $E_R(\rho) = 1$ independently of the dimension $d$. The tensor product $\rho^{\otimes 2}$
can be regarded as a state in $\mathcal{S}(\mathcal{H}^{\otimes 2} \otimes \mathcal{H}^{\otimes 2})$ with $U \otimes U \otimes V \otimes V$ symmetry, where
$U$, $V$ are unitaries on $\mathcal{H}$. Note that the corresponding state space of $UUVV$ invariant
states can be parameterized by the expectation of the three operators $F \otimes \mathbb{1}$, $\mathbb{1} \otimes F$
and $F \otimes F$ (cf. [159]) and we can apply the machinery just described to get the
minimizer $\tilde{\sigma}$ of $S(\rho^{\otimes 2}|\,\cdot\,)$. If $d > 2$ holds it turns out that

~ '^+^ '^-^ "_«P_ (5.48) 



2dtr(P+)2 



2dtr(P_) 



holds (where $P_\pm$ denote the projections onto the symmetric and antisymmetric
subspaces of $\mathcal{H} \otimes \mathcal{H}$) and not $\tilde{\sigma} = \sigma_0 \otimes \sigma_0$ as one would expect. As a consequence
we get the inequality

E^ip^^) = 2 - log, (^^) < 2 = Sip^^Wf^) = 2EM (5.49) 
$d = 2$ is a special case, where $\sigma_0^{\otimes 2}$ and $\tilde{\sigma}$ (and all their convex linear combinations)
give the same value 2. Hence for $d > 2$ the Relative Entropy of Entanglement is, as
stated, not additive.



Chapter 6 
Channel capacity 



In Section 4.4 we have seen that it is possible to send (quantum) information 
undisturbed through a noisy quantum channel, if we encode one qubit into a (pos- 
sibly long and highly entangled) string of qubits. This process is wasteful, since 
we have to use many instances of the channel to send just one qubit of quantum 
information. It is therefore natural to ask, which resources we need at least if we 
are using the best possible error correction scheme. More precisely the question is: 
With which maximal rate, i.e. information sent per channel usage, can we transmit
quantum information undisturbed through a noisy channel? This question naturally 
leads to the concept of channel capacities which we will review in this chapter. 

6.1 The general case 

We are mainly interested in classical and quantum capacities. The basic ideas behind 
both situations are however quite similar. In this section we will consider therefore 
a general definition of capacity which applies to arbitrary channels and both kinds 



of information. (See also [169] as a general reference for this section.)



6.1.1. The definition. — Hence consider two observable algebras $\mathcal{A}_1$, $\mathcal{A}_2$ and an
arbitrary channel $T : \mathcal{A}_1 \to \mathcal{A}_2$. To send systems described by a third observable
algebra $\mathcal{B}$ undisturbed through $T$ we need an encoding channel $E : \mathcal{A}_2 \to \mathcal{B}$ and a
decoding channel $D : \mathcal{B} \to \mathcal{A}_1$ such that $ETD$ equals the ideal channel $\mathcal{B} \to \mathcal{B}$, i.e.
the identity on $\mathcal{B}$. Note that the algebra $\mathcal{B}$ describing the systems to send, and the
input respectively output algebra of $T$ need not be of the same type, e.g. $\mathcal{B}$ can
be classical while $\mathcal{A}_1, \mathcal{A}_2$ are quantum (or vice versa).

In general (i.e. for arbitrary $T$ and $\mathcal{B}$) it is of course impossible to find such a pair
$E$ and $D$. In this case we are interested at least in encodings and decodings which
make the error produced during the transmission as small as possible. To make this
statement precise we need a measure for this error and there are in fact many good
choices for such a quantity (all of them leading to equivalent results, cf. Subsection
6.3.1). We will use in the following the "cb-norm difference" $\|ETD - \mathrm{Id}\|_{cb}$, where
Id is the identity (i.e. ideal) channel on $\mathcal{B}$ and $\|\cdot\|_{cb}$ denotes the norm of complete
boundedness ("cb-norm" for short)

$$\|T\|_{cb} = \sup_{n\in\mathbb{N}} \|T \otimes \mathrm{Id}_n\|, \qquad \mathrm{Id}_n : \mathcal{B}(\mathbb{C}^n) \to \mathcal{B}(\mathbb{C}^n). \qquad (6.1)$$

The cb-norm improves the sometimes annoying property of the usual operator norm
that quantities like $\|T \otimes \mathrm{Id}_{\mathcal{B}(\mathbb{C}^d)}\|$ may increase with the dimension $d$. On infinite
dimensional observable algebras $\|T\|_{cb}$ can be infinite although each term in the
supremum is finite. A particular example for a map with such a behavior is the
transposition on an infinite dimensional Hilbert space. A map with finite cb-norm
is therefore called completely bounded. In a finite dimensional setup each linear
map is completely bounded. For the transposition $\Theta$ on $\mathcal{B}(\mathbb{C}^d)$ we have in particular
$\|\Theta\|_{cb} = d$. The cb-norm has some nice features which we will use frequently; this
includes its multiplicativity $\|T_1 \otimes T_2\|_{cb} = \|T_1\|_{cb}\|T_2\|_{cb}$ and the fact that $\|T\|_{cb} = 1$
holds for each (unital) channel. Another useful relation is $\|T\|_{cb} = \|T \otimes \mathrm{Id}_{\mathcal{B}(\mathcal{H})}\|$,
which holds if $T$ is a map $\mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$. For more properties of the cb-norm let
us refer to [125].

Now we can define the quantity

$$\Delta(T, \mathcal{B}) = \inf_{E,D} \|ETD - \mathrm{Id}_{\mathcal{B}}\|_{cb}, \qquad (6.2)$$

where the infimum is taken over all channels $E : \mathcal{A}_2 \to \mathcal{B}$ and $D : \mathcal{B} \to \mathcal{A}_1$ and $\mathrm{Id}_{\mathcal{B}}$
is again the ideal $\mathcal{B}$-channel. $\Delta$ describes, as indicated above, the smallest possible
error we have to take into account if we try to transmit one $\mathcal{B}$ system through
one copy of the channel $T$ using any encoding $E$ and decoding $D$. In Section 4.4,
however, we have seen that we can reduce the error if we take $M$ copies of the
channel instead of just one. More generally we are interested in the transmission
of "codewords of length" $N$, i.e. $\mathcal{B}^{\otimes N}$ systems using $M$ copies of the channel $T$.
Encodings and decodings are in this case channels of the form $E : \mathcal{A}_2^{\otimes M} \to \mathcal{B}^{\otimes N}$
respectively $D : \mathcal{B}^{\otimes N} \to \mathcal{A}_1^{\otimes M}$. If we increase the number $M$ of channels the error
$\Delta(T^{\otimes M}, \mathcal{B}^{\otimes N})$ decreases provided the rate with which $N$ grows as a function of
$M$ is not too large. A more precise formulation of this idea leads to the following
definition.

Definition 6.1.1 Let $T$ be a channel and $\mathcal{B}$ an observable algebra. A number $c \ge 0$
is called achievable rate for $T$ with respect to $\mathcal{B}$, if for any pair of sequences $M_j$, $N_j$,
$j \in \mathbb{N}$ with $M_j \to \infty$ and $\limsup_{j\to\infty} N_j/M_j < c$ we have

$$\lim_{j\to\infty} \Delta(T^{\otimes M_j}, \mathcal{B}^{\otimes N_j}) = 0. \qquad (6.3)$$

The supremum of all achievable rates is called the capacity of $T$ with respect to $\mathcal{B}$
and denoted by $C(T, \mathcal{B})$.

Note that by definition $c = 0$ is an achievable rate hence $C(T, \mathcal{B}) \ge 0$. If on
the other hand each $c > 0$ is achievable we write $C(T, \mathcal{B}) = \infty$. At first sight
it seems cumbersome to check all pairs of sequences with given upper ratio when
testing $c$. Due to some monotonicity properties of $\Delta$, however, it can be shown that
it is sufficient to check only one sequence provided the $M_j$ satisfy the additional
condition $M_j/M_{j+1} \to 1$.

6.1.2. Simple calculations. — We see that there are in fact many different 
capacities of a given channel depending on the type of information we want to 
transmit. However, there are only two different cases we are interested in:
$\mathcal{B}$ can be either classical or quantum. We will discuss both special cases in
greater detail in the next two sections. Before we do this, however, we will take a short
look at some simple calculations which can be done in the general case. To this end it is
convenient to introduce the notations

$\mathcal{M}_d = \mathcal{B}(\mathbb{C}^d)$ and $\mathcal{C}_d = \mathcal{C}(\{1, \ldots, d\})$ (6.4)

as shorthand notations for $\mathcal{B}(\mathbb{C}^d)$ and $\mathcal{C}(\{1, \ldots, d\})$,
since some notations become otherwise a little bit clumsy. First of all let us have a look
at capacities of ideal channels. If $\mathrm{Id}_{\mathcal{M}_f}$ and
$\mathrm{Id}_{\mathcal{C}_f}$ denote the identity channels on the quantum algebra
$\mathcal{M}_f$ respectively the classical algebra $\mathcal{C}_f$ we get

$C(\mathrm{Id}_{\mathcal{C}_f}, \mathcal{M}_d) = 0, \quad C(\mathrm{Id}_{\mathcal{C}_f}, \mathcal{C}_d) = C(\mathrm{Id}_{\mathcal{M}_f}, \mathcal{M}_d) = C(\mathrm{Id}_{\mathcal{M}_f}, \mathcal{C}_d) = \frac{\log_2 f}{\log_2 d}.$ (6.5)

The first equation is the channel capacity version of the no-teleportation theorem:
It is impossible to transfer quantum information through a classical channel. The
other equations follow simply by counting dimensions.
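
The dimension counting can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes that $N$ systems of dimension $d$ fit without error into $M$ ideal $f$-level systems exactly when $d^N \leq f^M$, and the helper names `max_codeword_length` and `ideal_capacity` are hypothetical.

```python
import math


def max_codeword_length(f: int, d: int, m: int) -> int:
    """Largest N with d**N <= f**m, using exact integer arithmetic."""
    budget = f ** m
    n = 0
    power = d  # holds d**(n + 1)
    while power <= budget:
        n += 1
        power *= d
    return n


def ideal_capacity(f: int, d: int) -> float:
    """Asymptotic rate log2(f) / log2(d) suggested by dimension counting."""
    return math.log2(f) / math.log2(d)


if __name__ == "__main__":
    f, d = 3, 2
    for m in (1, 10, 100, 1000):
        # The ratio N/M approaches log2(3) from below as M grows.
        print(m, max_codeword_length(f, d, m) / m)
    print("limit:", ideal_capacity(f, d))
```

The ratio $N/M$ never exceeds the limiting value, which is the sense in which the right hand side of the capacity formula is a supremum of achievable rates.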

For the next relation it is convenient to associate to a pair of channels $T, S$ the
quantity $C(T,S)$ which arises if we replace in Definition 6.1.1 and Equation (6.2)
the ideal channel $\mathrm{Id}_{\mathcal{B}}$ by an arbitrary channel $S$. Hence $C(T,S)$
is a slight generalization of the channel capacity which describes with which asymptotic
rate the channel $S$ can be approximated by $T$ (and appropriate encodings and decodings).
These generalized capacities satisfy the two step coding inequality, i.e. for the three
channels $T_1, T_2, T_3$ we have

$C(T_3, T_1) \geq C(T_2, T_1)\, C(T_3, T_2).$ (6.6)

To prove it consider the relations

$\|T_1^{\otimes M} - E_1 E_2 T_3^{\otimes K} D_2 D_1\|_{cb} = \|T_1^{\otimes M} - E_1 T_2^{\otimes N} D_1 + E_1 T_2^{\otimes N} D_1 - E_1 E_2 T_3^{\otimes K} D_2 D_1\|_{cb}$ (6.7)
$\leq \|T_1^{\otimes M} - E_1 T_2^{\otimes N} D_1\|_{cb} + \|E_1\|_{cb}\, \|T_2^{\otimes N} - E_2 T_3^{\otimes K} D_2\|_{cb}\, \|D_1\|_{cb}$ (6.8)
$\leq \|T_1^{\otimes M} - E_1 T_2^{\otimes N} D_1\|_{cb} + \|T_2^{\otimes N} - E_2 T_3^{\otimes K} D_2\|_{cb}$ (6.9)

where we have used for the last inequality the fact that the cb-norm of a
channel is one. If $c_1$ is an achievable rate of $T_1$ with respect to $T_2$ such that
$\limsup_{j\to\infty} M_j/N_j < c_1$ and $c_2$ is an achievable rate of $T_2$ with respect
to $T_3$ such that $\limsup_{j\to\infty} N_j/K_j < c_2$ we see that

$\limsup_{j\to\infty} \frac{M_j}{K_j} = \limsup_{j\to\infty} \frac{M_j}{N_j} \frac{N_j}{K_j} \leq \limsup_{j\to\infty} \frac{M_j}{N_j}\; \limsup_{k\to\infty} \frac{N_k}{K_k}.$ (6.10)

If we choose the sequences $M_j$, $N_j$ and $K_j$ cleverly enough (cf. the remark following
Definition 6.1.1) this implies that $c_1 c_2$ is an achievable rate for $T_1$ with respect to
$T_3$ and this proves Equation (6.6).

As a first application of (6.6), we can relate all capacities $C(T, \mathcal{M}_d)$ (and
$C(T, \mathcal{C}_d)$) for different $d$ to one another. If we choose $T_3 = T$,
$T_1 = \mathrm{Id}_{\mathcal{M}_d}$ and $T_2 = \mathrm{Id}_{\mathcal{M}_f}$ we get with (6.5)
$C(T, \mathcal{M}_d) \geq \frac{\log_2 f}{\log_2 d}\, C(T, \mathcal{M}_f)$, and exchanging
$d$ with $f$ shows that even equality holds. A similar relation can be shown for
$C(T, \mathcal{C}_d)$. Hence the dimension of the observable algebra $\mathcal{B}$
describing the type of information to be transmitted enters only via a multiplicative
constant, i.e. it is only a choice of units, and we define the classical capacity $C_c(T)$
and the quantum capacity $C_q(T)$ of a channel $T$ as

$C_c(T) = C(T, \mathcal{C}_2), \quad C_q(T) = C(T, \mathcal{M}_2).$ (6.11)

A second application of Equation (6.6) is a relation between the classical and
the quantum capacity of a channel. Setting $T_3 = T$, $T_1 = \mathrm{Id}_{\mathcal{C}_2}$ and
$T_2 = \mathrm{Id}_{\mathcal{M}_2}$ we get again with (6.5)

$C_q(T) \leq C_c(T).$ (6.12)

Note that it is now not possible to interchange the roles of $\mathcal{C}_2$ and
$\mathcal{M}_2$, because $C(\mathrm{Id}_{\mathcal{C}_2}, \mathcal{M}_2) = 0$. Hence equality
does not hold in general.

Another useful relation concerns concatenated channels: We transmit information of type
$\mathcal{B}$ first through a channel $T_1$ and then through a second channel $T_2$.
It is reasonable to assume that the capacity of the composition $T_2 T_1$ cannot be
bigger than the capacity of the channel with the smallest bandwidth. This conjecture
is indeed true and known as the "Bottleneck inequality":

$C(T_2 T_1, \mathcal{B}) \leq \min\{C(T_1, \mathcal{B}), C(T_2, \mathcal{B})\}.$ (6.13)

To see this consider an encoding and a decoding channel $E$ respectively $D$ for
$(T_2 T_1)^{\otimes M}$, i.e. in the definition of $C(T_2 T_1, \mathcal{B})$ we look at

$\|\mathrm{Id}_{\mathcal{B}}^{\otimes N} - E (T_2 T_1)^{\otimes M} D\|_{cb} = \|\mathrm{Id}_{\mathcal{B}}^{\otimes N} - (E T_2^{\otimes M})\, T_1^{\otimes M} D\|_{cb}.$ (6.14)

This implies that $E T_2^{\otimes M}$ and $D$ are an encoding and a decoding channel for $T_1$.
Something similar holds for $E$ and $T_1^{\otimes M} D$ with respect to $T_2$. Hence each
achievable rate for $T_2 T_1$ is also an achievable rate for $T_2$ and $T_1$, and this proves
Equation (6.13).

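Finally we want to consider two channels $T_1, T_2$ in parallel, i.e. we consider the
tensor product $T_1 \otimes T_2$. If $E_j, D_j$, $j = 1,2$ are encoding, respectively decoding
channels for $T_1^{\otimes M}$ and $T_2^{\otimes M}$ such that
$\|\mathrm{Id}_{\mathcal{B}}^{\otimes N_j} - E_j T_j^{\otimes M} D_j\|_{cb} \leq \epsilon$ holds,
we get

$\|\mathrm{Id} - E_1 \otimes E_2 (T_1 \otimes T_2)^{\otimes M} D_1 \otimes D_2\|_{cb} = \|\mathrm{Id} - \mathrm{Id} \otimes (E_2 T_2^{\otimes M} D_2) + \mathrm{Id} \otimes (E_2 T_2^{\otimes M} D_2) - E_1 \otimes E_2 (T_1 \otimes T_2)^{\otimes M} D_1 \otimes D_2\|_{cb}$ (6.15)
$\leq \|\mathrm{Id} \otimes (\mathrm{Id} - E_2 T_2^{\otimes M} D_2)\|_{cb} + \|(\mathrm{Id} - E_1 T_1^{\otimes M} D_1) \otimes E_2 T_2^{\otimes M} D_2\|_{cb}$ (6.16)
$\leq \|\mathrm{Id} - E_2 T_2^{\otimes M} D_2\|_{cb} + \|\mathrm{Id} - E_1 T_1^{\otimes M} D_1\|_{cb} \leq 2\epsilon.$ (6.17)

Hence $c_1 + c_2$ is achievable for $T_1 \otimes T_2$ if $c_j$ is achievable for $T_j$. This
implies the inequality

$C(T_1 \otimes T_2, \mathcal{B}) \geq C(T_1, \mathcal{B}) + C(T_2, \mathcal{B}).$ (6.18)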

When all channels are ideal, or when all systems involved are classical even equality 
holds, i.e. channel capacities are additive in this case. However, if quantum channels 
are considered, it is one of the big open problems of the field, to decide under which 
conditions additivity holds. 

6.2 The classical capacity 

In this section we will discuss the classical capacity $C_c(T)$ of a channel $T$. There
are in fact three different cases to consider: $T$ can be either classical or quantum
and in the quantum case we can use either ordinary encodings and decodings or a
dense coding scheme (cf. Subsection 4.1.3).

6.2.1. Classical channels. — Let us consider first a classical to classical channel 
$T : \mathcal{C}(Y) \to \mathcal{C}(X)$. This is basically the situation of classical
information theory and we will only have a short look here - mainly to show how this
(well known) situation fits into the general scheme described in the last section^.

First of all we have to calculate the error quantity $\Delta(T, \mathcal{C}_2)$ defined in
Equation (6.2). As stated in Subsection 3.2.3, $T$ is completely determined by its transition
probabilities $T_{xy}$, $(x,y) \in X \times Y$ describing the probability to receive
$x \in X$ when $y \in Y$ was sent. Since the cb-norm for a classical algebra coincides with
the ordinary norm we get (we have set $X = Y$ for this calculation):

$\|\mathrm{Id} - T\|_{cb} = \|\mathrm{Id} - T\| = \sup_{x,f} \Bigl| \sum_y (\delta_{xy} - T_{xy}) f_y \Bigr|$ (6.19)
$= 2 \sup_x (1 - T_{xx})$ (6.20)

where the supremum in the first equation is taken over all $f \in \mathcal{C}(X)$ with
$\|f\| = \sup_y |f_y| \leq 1$. We see that the quantity in Equation (6.20) is exactly twice
the maximal error probability, i.e. the maximal probability of sending $x$ and getting
anything different. Inserting this quantity for $\Delta$ in Definition 6.1.1 applied to a
classical channel $T$ and the "bit-algebra" $\mathcal{B} = \mathcal{C}_2$, we get exactly
Shannon's classical definition of the capacity of a discrete memoryless channel [138].



^Please note that this implies in particular that we do not give a complete review of the
foundations of classical information theory here; cf. [101, 62] instead.

Hence we can apply Shannon's noisy channel coding theorem to calculate $C_c(T)$
for a classical channel. To state it we have to introduce first some terminology.
Consider therefore a state $p \in \mathcal{C}^*(X)$ of the classical input algebra
$\mathcal{C}(X)$ and its image $q = T^*(p) \in \mathcal{C}^*(Y)$ under the channel. $p$ and
$q$ are probability distributions on $X$ respectively $Y$ and $p_x$ can be interpreted as
the probability that the "letter" $x \in X$ was sent. Similarly $q_y = \sum_x T_{xy} p_x$ is
the probability that $y \in Y$ was received and $P_{xy} = T_{xy} p_x$ is the probability that
$x \in X$ was sent and $y \in Y$ was received. The family of all $P_{xy}$ can be interpreted
as a probability distribution $P$ on $X \times Y$ and the $T_{xy}$ can be regarded as
conditional probability of $P$ under the condition $x$. Now we can introduce the mutual
information

$I(p,T) = S(p) + S(q) - S(P) = \sum_{(x,y) \in X \times Y} P_{xy} \log_2 \frac{P_{xy}}{p_x q_y},$ (6.21)

where $S(p)$, $S(q)$ and $S(P)$ denote the entropies of $p$, $q$ and $P$. The mutual
information describes, roughly speaking, the information that $p$ and $q$ contain about
each other. E.g. if $p$ and $q$ are completely uncorrelated (i.e. $P_{xy} = p_x q_y$) we get
$I(p,T) = 0$. If $T$ is on the other hand an ideal bit-channel and $p$ equally distributed
we have $I(p,T) = 1$. Now we can state Shannon's Theorem which expresses the
classical capacity of $T$ in terms of mutual informations [138]:

Theorem 6.2.1 (Shannon) The classical capacity $C_c(T)$ of a classical communication
channel $T : \mathcal{C}(Y) \to \mathcal{C}(X)$ is given by

$C_c(T) = \sup_p I(p,T),$ (6.22)

where the supremum is taken over all states $p \in \mathcal{C}^*(X)$.
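
The mutual information and the supremum in Shannon's Theorem can be evaluated directly for small alphabets. The sketch below is only an illustration under stated assumptions: the matrix `T[x][y]` is taken to hold the probability of receiving `y` given that `x` was sent (an indexing convention chosen here, not fixed by the text), the input alphabet is binary, and the supremum is approximated by a simple grid search. All function names are hypothetical.

```python
import math


def mutual_information(p, T):
    """I(p, T) in bits; T[x][y] = probability of receiving y when x was sent."""
    n_out = len(T[0])
    # Output distribution q_y = sum_x p_x T[x][y].
    q = [sum(p[x] * T[x][y] for x in range(len(p))) for y in range(n_out)]
    total = 0.0
    for x, px in enumerate(p):
        for y, qy in enumerate(q):
            pxy = px * T[x][y]  # joint probability P_xy
            if pxy > 0.0:
                total += pxy * math.log2(pxy / (px * qy))
    return total


def capacity_binary_input(T, steps=10000):
    """Grid search for sup_p I(p, T) over input distributions (p0, 1 - p0)."""
    return max(
        mutual_information([i / steps, 1.0 - i / steps], T)
        for i in range(steps + 1)
    )


def binary_entropy(e):
    """H(e) = -e log2(e) - (1 - e) log2(1 - e)."""
    if e <= 0.0 or e >= 1.0:
        return 0.0
    return -e * math.log2(e) - (1 - e) * math.log2(1 - e)


if __name__ == "__main__":
    flip = 0.1
    bsc = [[1 - flip, flip], [flip, 1 - flip]]  # binary symmetric channel
    print(capacity_binary_input(bsc), 1 - binary_entropy(flip))
```

For the binary symmetric channel the grid search reproduces the well known value $1 - H(e)$, attained at the uniform input distribution.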

6.2.2. Quantum channels. — If we transmit classical data through a quantum 
channel $T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$, the encoding
$E : \mathcal{B}(\mathcal{H}) \to \mathcal{C}_2$ is a parameter dependent preparation and the
decoding $D : \mathcal{C}_2 \to \mathcal{B}(\mathcal{H})$ is an observable. Hence the
composition $ETD$ is a channel $\mathcal{C}_2 \to \mathcal{C}_2$, i.e. a purely classical
channel, and we can calculate its capacity in terms of Shannon's Theorem (Theorem 6.2.1).
This observation leads to the definition of the "one-shot" classical capacity of $T$:

$C_{c,1}(T) = \sup_{E,D} C_c(ETD),$ (6.23)

where the supremum is taken over all encodings and decodings of classical bits. The
term "one-shot" in this definition arises from the fact that we need apparently only
one invocation of the channel $T$. However, many uses of the channel are hidden in
the definition of the classical capacity on the right hand side. Hence $C_{c,1}(T)$ can
be defined alternatively in the same way as $C_c(T)$ except that no entanglement
is allowed during encoding and decoding, or more precisely: in Definition 6.1.1 we
consider only encodings $E : \mathcal{B}(\mathcal{H})^{\otimes M} \to \mathcal{C}_2^{\otimes N}$
which prepare separable states and only decodings
$D : \mathcal{C}_2^{\otimes N} \to \mathcal{B}(\mathcal{H})^{\otimes M}$ which lead to separable
observables. It is not yet known whether entangled codings can help to increase the
transmission rate. Therefore we only know that

$C_{c,1}(T) \leq C_c(T) = \sup_{M \in \mathbb{N}} \frac{1}{M} C_{c,1}(T^{\otimes M})$ (6.24)

holds. One reason why $C_{c,1}(T)$ is an interesting quantity is that we have, due to the
following theorem by Holevo, a computable expression for it.

Theorem 6.2.2 The one-shot classical capacity $C_{c,1}(T)$ of a quantum channel
$T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ is given by

$C_{c,1}(T) = \sup_{p_j, \rho_j} \Bigl[ S\Bigl( \sum_j p_j T^*[\rho_j] \Bigr) - \sum_j p_j S\bigl( T^*[\rho_j] \bigr) \Bigr],$ (6.25)

where the supremum is taken over all probability distributions $p_j$ and collections of
density operators $\rho_j$.
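
The expression under the supremum in Theorem 6.2.2 (entropy of the averaged output minus the average output entropy) is easy to evaluate for qubits. The following sketch is an illustration only and rests on assumptions that are not part of the theorem: states are represented by Bloch vectors, so that the entropy depends only on the length of the vector, and the example channel is the qubit depolarizing channel, which shrinks every Bloch vector by the factor $(1 - \vartheta)$. The code evaluates the expression for one given ensemble; it does not perform the optimization. All names are hypothetical.

```python
import math


def h2(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def qubit_entropy(bloch):
    """von Neumann entropy of the qubit state with the given Bloch vector."""
    r = min(1.0, math.sqrt(sum(c * c for c in bloch)))
    # The eigenvalues of the density matrix are (1 + r) / 2 and (1 - r) / 2.
    return h2((1 + r) / 2)


def depolarize(bloch, theta):
    """Qubit depolarizing channel in the Bloch picture: r -> (1 - theta) r."""
    return tuple((1 - theta) * c for c in bloch)


def holevo_quantity(probs, blochs, channel):
    """S(sum_j p_j T*[rho_j]) - sum_j p_j S(T*[rho_j]) for a qubit ensemble."""
    outs = [channel(b) for b in blochs]
    avg = tuple(sum(p * o[i] for p, o in zip(probs, outs)) for i in range(3))
    return qubit_entropy(avg) - sum(p * qubit_entropy(o) for p, o in zip(probs, outs))


if __name__ == "__main__":
    theta = 0.2
    ensemble = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]  # two orthogonal pure states
    chi = holevo_quantity([0.5, 0.5], ensemble, lambda b: depolarize(b, theta))
    print(chi)
```

For two equiprobable orthogonal pure states the averaged output is maximally mixed, so the first term equals one bit and the result is $1 - H(\vartheta/2)$ with the binary entropy $H$.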

6.2.3. Entanglement assisted capacity. — Another classical capacity of a 
quantum channel arises, if we use dense coding schemes instead of simple encod- 
ings and decodings to transmit the data through the channel T. In other words we 
can define the entanglement enhanced classical capacity Ce(T) in the same way as 
$C_c(T)$ but by replacing the encoding and decoding channels in Definition 6.1.1 and
Equation (6.2) by dense coding protocols. Note that this implies that the sender
Alice and the receiver Bob share an (arbitrary) amount of (maximally) entangled
states prior to the transmission.

For this quantity a coding theorem was proven recently by Bennett and others,
which we want to state in the following. To this end assume that we are transmitting
systems in the state $\rho \in \mathcal{B}^*(\mathcal{H})$ through the channel and that
$\rho$ has the purification $\Psi \in \mathcal{H} \otimes \mathcal{H}$, i.e.
$\rho = \mathrm{tr}_1 |\Psi\rangle\langle\Psi| = \mathrm{tr}_2 |\Psi\rangle\langle\Psi|$. Then
we can define the entropy exchange

$S(\rho, T) = S\bigl[ (T \otimes \mathrm{Id})(|\Psi\rangle\langle\Psi|) \bigr].$ (6.26)

The density operator $(T \otimes \mathrm{Id})(|\Psi\rangle\langle\Psi|)$ has the output state
$T^*(\rho)$ and the input state $\rho$ as its partial traces. It can be regarded therefore
as the quantum analog of the input/output probability distribution $P_{xy}$ defined in
Subsection 6.2.1. Another way to look at $S(\rho, T)$ is in terms of an ancilla
representation of $T$: If $T^*(\rho) = \mathrm{tr}_{\mathcal{K}}(U \rho \otimes \rho_{\mathcal{K}} U^*)$
with a unitary $U$ on $\mathcal{H} \otimes \mathcal{K}$ and a pure environment state
$\rho_{\mathcal{K}}$, it can be shown that $S(\rho, T) = S[T_{\mathcal{K}}^* \rho]$ where
$T_{\mathcal{K}}$ is the channel describing the information transfer into the environment,
i.e. $T_{\mathcal{K}}^*(\rho) = \mathrm{tr}_{\mathcal{H}}(U \rho \otimes \rho_{\mathcal{K}} U^*)$;
in other words $S(\rho, T)$ is the final entropy of the environment. Now we can define

$I(\rho, T) = S(\rho) + S(T^* \rho) - S(\rho, T)$ (6.27)

which is the quantum analog of the mutual information given in Equation (6.21).
It has a number of nice properties, in particular positivity, concavity with respect
to the input state and additivity, and its maximum with respect to $\rho$ actually
coincides with $C_e(T)$.

Theorem 6.2.3 The entanglement assisted capacity $C_e(T)$ of a quantum channel
$T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ is given by

$C_e(T) = \sup_\rho I(\rho, T),$ (6.28)

where the supremum is taken over all input states $\rho \in \mathcal{B}^*(\mathcal{H})$.

Due to the nice additivity properties of the quantum mutual information $I(\rho, T)$
the capacity $C_e(T)$ is known to be additive as well. This implies that it coincides
with the corresponding "one-shot" capacity, and this is an essential simplification
compared to the classical capacity $C_c(T)$.



6.2.4. Examples. — Although the expressions in Theorem 6.2.2 and 6.2.3 are 
much easier than the original definitions, they still involve some optimization
problems over possibly large parameter spaces. Nevertheless there are special cases which
allow explicit calculations. As a first example we will consider the "quantum erasure
channel" which transmits with probability $1 - \vartheta$ the $d$-dimensional input state
intact while it is replaced with probability $\vartheta$ by an "erasure symbol", i.e. a
$(d+1)$-th pure state $\psi_e$ which is orthogonal to all others. In the Schrödinger
picture this is

$\mathcal{B}^*(\mathbb{C}^d) \ni \rho \mapsto T^*(\rho) = (1 - \vartheta)\rho + \vartheta\, \mathrm{tr}(\rho) |\psi_e\rangle\langle\psi_e| \in \mathcal{B}^*(\mathbb{C}^{d+1}).$ (6.29)

This example is very unusual, because all capacities discussed up to now (including
the quantum capacity as we will see in Subsection 6.3.2) can be calculated explicitly:
We get $C_{c,1}(T) = C_c(T) = (1 - \vartheta) \log_2(d)$ for the classical and
$C_e(T) = 2 C_c(T)$ for the entanglement enhanced classical capacity. Hence the gain by
entanglement assistance is exactly a factor two; cf. Figure 6.1.

[Plot legend: classical capacity, ee. classical capacity, quantum capacity.]

Figure 6.1: Capacities of the quantum erasure channel plotted as a function of the
error probability.
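
The three curves of Figure 6.1 follow from closed formulas, so they can be tabulated in a few lines. This is a sketch under the assumption that the classical and entanglement enhanced capacities are $(1 - \vartheta)\log_2 d$ and twice that value, as stated above, and that the quantum capacity is $\max(0, (1 - 2\vartheta)\log_2 d)$, the value quoted in Subsection 6.3.2. The function name is hypothetical.

```python
import math


def erasure_capacities(theta: float, d: int = 2):
    """Return (C_c, C_e, C_q) of the quantum erasure channel in bits."""
    cc = (1 - theta) * math.log2(d)                # classical capacity
    ce = 2 * cc                                    # entanglement enhanced capacity
    cq = max(0.0, (1 - 2 * theta) * math.log2(d))  # quantum capacity
    return cc, ce, cq


if __name__ == "__main__":
    for theta in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(theta, erasure_capacities(theta))
```

The quantum capacity vanishes already at erasure probability $1/2$, while the two classical capacities stay positive up to $\vartheta = 1$.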

Our next example is the depolarizing channel

$\mathcal{B}^*(\mathbb{C}^d) \ni \rho \mapsto T^*(\rho) = (1 - \vartheta)\rho + \vartheta\, \mathrm{tr}(\rho) \frac{\mathbb{1}}{d} \in \mathcal{B}^*(\mathbb{C}^d)$ (6.30)

already discussed earlier. It is more interesting and more difficult to study. It
is in particular not known whether $C_c$ and $C_{c,1}$ coincide in this case (i.e. the value
of $C_c$ is not known). Therefore we can compare $C_e(T)$ only with $C_{c,1}$. Using the
unitary covariance of $T$ (cf. Subsection 3.2.2) we see first that
$I(U \rho U^*, T) = I(\rho, T)$ holds for all unitaries $U$ (to calculate
$S(U \rho U^*, T)$ note that $(U \otimes \mathbb{1}) \Psi$ is a purification of
$U \rho U^*$ if $\Psi$ is a purification of $\rho$). Due to the concavity of $I(\rho, T)$ in
the first argument we can average over all unitaries and see that the maximum in
Equation (6.28) is achieved on the maximally mixed state. Straightforward calculation
therefore shows that

$C_e(T) = \log_2(d^2) + \Bigl(1 - \vartheta \frac{d^2 - 1}{d^2}\Bigr) \log_2 \Bigl(1 - \vartheta \frac{d^2 - 1}{d^2}\Bigr) + \vartheta \frac{d^2 - 1}{d^2} \log_2 \frac{\vartheta}{d^2}$ (6.31)

holds, while we have

$C_{c,1}(T) = \log_2(d) + \Bigl(1 - \vartheta \frac{d - 1}{d}\Bigr) \log_2 \Bigl(1 - \vartheta \frac{d - 1}{d}\Bigr) + \vartheta \frac{d - 1}{d} \log_2 \frac{\vartheta}{d},$ (6.32)

where the maximum in Equation (6.25) is achieved for an ensemble of equiprobable
pure states taken from an orthonormal basis in $\mathcal{H}$. This is plausible since the
first term under the sup in Equation (6.25) becomes maximal and the second becomes
minimal: $\sum_j p_j T^* \rho_j$ is maximally mixed in this case and its entropy is therefore
maximal. The entropies of the $T^* \rho_j$ are on the other hand minimal if the $\rho_j$ are
pure. In Figure 6.2 we have plotted both capacities as a function of the noise parameter
$\vartheta$, and in Figure 6.3 we have plotted the quotient $C_e(T) / C_{c,1}(T)$ which gives
an upper bound on the gain we get from entanglement assistance.

[Plot legend: one-shot cl. capacity, entanglement enhanced cl. capacity.]

Figure 6.2: Entanglement enhanced and one-shot classical capacity of a depolarizing
qubit channel.

Figure 6.3: Gain of using entanglement assisted versus unassisted classical capacity
for a depolarizing qubit channel.
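
The curves of Figures 6.2 and 6.3 can be reproduced numerically. The sketch below assumes the closed forms that follow from the eigenvalues of the relevant output states: a pure input yields the eigenvalue $1 - \vartheta(d-1)/d$ once and $\vartheta/d$ with multiplicity $d-1$, and a maximally entangled input to $T \otimes \mathrm{Id}$ yields $1 - \vartheta(d^2-1)/d^2$ once and $\vartheta/d^2$ with multiplicity $d^2-1$. The function names are hypothetical.

```python
import math


def _xlog2x(x):
    """x * log2(x) with the convention 0 * log2(0) = 0."""
    return 0.0 if x <= 0.0 else x * math.log2(x)


def depolarizing_ce(theta, d=2):
    """Entanglement enhanced classical capacity of the depolarizing channel."""
    a = theta * (d * d - 1) / (d * d)
    last = a * math.log2(theta / (d * d)) if theta > 0.0 else 0.0
    return math.log2(d * d) + _xlog2x(1 - a) + last


def depolarizing_cc1(theta, d=2):
    """One-shot classical capacity of the depolarizing channel."""
    a = theta * (d - 1) / d
    last = a * math.log2(theta / d) if theta > 0.0 else 0.0
    return math.log2(d) + _xlog2x(1 - a) + last


if __name__ == "__main__":
    for theta in (0.0, 0.2, 0.4, 0.6, 0.8):
        ce, cc1 = depolarizing_ce(theta), depolarizing_cc1(theta)
        print(theta, ce, cc1, ce / cc1)
```

Both capacities vanish at $\vartheta = 1$, where the channel output is independent of the input, so the quotient is only printed for smaller noise parameters.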



As a third example we want to consider Gaussian channels defined in Subsection 3.3.4.
Hence consider the Hilbert space $\mathcal{H} = L^2(\mathbb{R})$ describing a one-dimensional
harmonic oscillator (or one mode of the electromagnetic field) and the
amplification/attenuation channel $T$ defined in Equation (3.74). The results we want to
state concern a slight modification of the original definitions of $C_{c,1}(T)$ and
$C_e(T)$: We will consider capacities for channels with constrained input. This means that
only a restricted class of states $\rho$ on the input Hilbert space of the channel is
allowed for encoding. In our case this means that we will consider the constraint
$\mathrm{tr}(\rho\, a a^*) \leq N$ for a positive real number $N > 0$ and with the usual
creation and annihilation operators $a^*, a$. This can be rewritten as an energy constraint
for a quadratic Hamiltonian; hence this is a physically realistic restriction.




Figure 6.4: One-shot and entanglement enhanced classical capacity of a Gaussian
amplification/attenuation channel with Nc = and input noise N = 10

For the entanglement enhanced capacity it can be shown now that the maximum
in Equation (6.28) is taken on Gaussian states. To get $C_e(T)$ it is sufficient therefore
to calculate the quantum mutual information $I(\rho, T)$ for the Gaussian state $\rho_N$
from Equation (3.64). The details can be found in [84]; we will only state the
results here. With the abbreviation

$g(x) = (x + 1) \log_2(x + 1) - x \log_2 x$ (6.33)

we get $S(\rho_N) = g(N)$ and $S(T[\rho_N]) = g(N')$ with
$N' = k^2 N + \max\{0, k^2 - 1\} + N_c$ (cf. Equation (3.75)) for the entropies of input and
output states and

$S(\rho_N, T) = g\Bigl( \frac{D + N' - N - 1}{2} \Bigr) + g\Bigl( \frac{D - N' + N - 1}{2} \Bigr)$ (6.34)

with

$D = \sqrt{(N + N' + 1)^2 - 4 k^2 N (N + 1)}$ (6.35)

for the entropy exchange. Combining the three terms according to Equation (6.27) gives
$C_e(T)$, which we have plotted in Figure 6.4 as a function of $k$.

To calculate the one-shot capacity $C_{c,1}(T)$ the optimization in Equation (6.25)
has to be carried out over probability distributions $p_j$ and collections of density
operators $\rho_j$ such that $\sum_j p_j \mathrm{tr}(a a^* \rho_j) \leq N$ holds. It is
conjectured but not yet proven that the maximum is achieved on coherent states with
Gaussian probability distribution $p(x) = (\pi N)^{-1} \exp(-|x|^2 / N)$. If this is true
we get

$C_{c,1}(T) = g(N') - g(N'_0)$ with $N'_0 = \max\{0, k^2 - 1\} + N_c.$ (6.36)
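
The Gaussian formulas lend themselves to direct numerical evaluation. The following sketch assumes the expressions as read here: $g(x) = (x+1)\log_2(x+1) - x\log_2 x$, the output photon number $N' = k^2 N + \max\{0, k^2 - 1\} + N_c$, the entropy exchange built from $D$, and the conjectured one-shot value $g(N') - g(N'_0)$. The function names are hypothetical, and the one-shot value inherits the conjectural status stated in the text.

```python
import math


def g(x):
    """g(x) = (x + 1) log2(x + 1) - x log2(x), continued by g(0) = 0."""
    if x <= 0.0:
        return 0.0
    return (x + 1) * math.log2(x + 1) - x * math.log2(x)


def output_photon_number(n, k, nc):
    """N' = k^2 N + max(0, k^2 - 1) + N_c."""
    return k * k * n + max(0.0, k * k - 1) + nc


def entropy_exchange(n, k, nc):
    """Entropy exchange S(rho_N, T) of the amplification/attenuation channel."""
    n_out = output_photon_number(n, k, nc)
    d = math.sqrt((n + n_out + 1) ** 2 - 4 * k * k * n * (n + 1))
    return g((d + n_out - n - 1) / 2) + g((d - n_out + n - 1) / 2)


def ce_gaussian(n, k, nc):
    """Entanglement enhanced capacity: S(rho) + S(T rho) - S(rho, T)."""
    return g(n) + g(output_photon_number(n, k, nc)) - entropy_exchange(n, k, nc)


def cc1_gaussian_conjectured(n, k, nc):
    """Conjectured one-shot capacity g(N') - g(N'_0)."""
    return g(output_photon_number(n, k, nc)) - g(max(0.0, k * k - 1) + nc)


if __name__ == "__main__":
    for k in (0.5, 1.0, 1.5):
        print(k, ce_gaussian(10.0, k, 0.5), cc1_gaussian_conjectured(10.0, k, 0.5))
```

As a consistency check, for the ideal channel ($k = 1$, $N_c = 0$) one gets $D = 1$, the entropy exchange vanishes, and the entanglement enhanced value is exactly twice the one-shot value $g(N)$.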







Figure 6.5: Gain of using entanglement assisted versus unassisted classical capacity 
for a Gaussian amplification/attenuation channel with Nc = and input noise 
N = 0.1,1,10 



6.3 The quantum capacity 

The quantum capacity of a quantum channel
$T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ is more difficult to
treat than the classical capacities discussed in the last section. There is in particular
no coding theorem available which would allow explicit calculations. Nevertheless
there are partial results available, which we will review in the following.

6.3.1. Alternative definitions. — Let us start with two alternative definitions 
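of $C_q(T)$. The first one, proposed by Bennett, differs only in the error quantity
which should go to zero. Instead of the cb-norm the minimal fidelity is used. For a
channel $T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ and a subspace
$\mathcal{H}' \subset \mathcal{H}$ it is defined as

$\mathcal{F}_p(\mathcal{H}', T) = \inf_{\psi \in \mathcal{H}'} \langle \psi, T[|\psi\rangle\langle\psi|]\, \psi \rangle$ (6.37)

and if $\mathcal{H}' = \mathcal{H}$ holds we simply write $\mathcal{F}_p(T)$. Hence a number
$c$ is an achievable rate if

$\lim_{j\to\infty} \mathcal{F}_p(E_j T^{\otimes M_j} D_j) = 1$ (6.38)

holds for sequences

$E_j : \mathcal{B}(\mathcal{H})^{\otimes M_j} \to \mathcal{M}_2^{\otimes N_j}, \quad D_j : \mathcal{M}_2^{\otimes N_j} \to \mathcal{B}(\mathcal{H})^{\otimes M_j}, \quad j \in \mathbb{N}$ (6.39)

of encodings and decodings and sequences of integers $M_j, N_j$, $j \in \mathbb{N}$
satisfying the same constraints as in Definition 6.1.1 (in particular
$\lim_{j\to\infty} N_j / M_j < c$). The equivalence to our version of $C_q(T)$ follows now
from the estimates [169]

$\|T - \mathrm{Id}\| \leq \|T - \mathrm{Id}\|_{cb} \leq 4 \sqrt{\|T - \mathrm{Id}\|}$ (6.40)

$\|T - \mathrm{Id}\| \leq 4 \sqrt{1 - \mathcal{F}_p(T)} \leq 4 \sqrt{\|T - \mathrm{Id}\|}.$ (6.41)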
A second version of $C_q(T)$ can be found in the literature. To state it let us define
first a quantum source as a sequence $\rho_N$, $N \in \mathbb{N}$ of density operators
$\rho_N \in \mathcal{B}^*(\mathcal{K}^{\otimes N})$ (with an appropriate Hilbert space
$\mathcal{K}$) and the entropy rate of this source as
$\limsup_{N\to\infty} S(\rho_N)/N$. In addition we need the entanglement fidelity of a state
$\rho$ (with respect to a channel $T$)

$\mathcal{F}_e(\rho, T) = \langle \Psi, (T \otimes \mathrm{Id})[|\Psi\rangle\langle\Psi|]\, \Psi \rangle,$ (6.42)

where $\Psi$ is the purification of $\rho$. Now we define $c \geq 0$ to be achievable if there
is a quantum source $\rho_N$, $N \in \mathbb{N}$ with entropy rate $c$ such that

$\lim_{N\to\infty} \mathcal{F}_e(\rho_N, E'_N T^{\otimes N} D'_N) = 1$ (6.43)

holds with encodings and decodings

$E'_N : \mathcal{B}(\mathcal{H})^{\otimes N} \to \mathcal{B}(\mathcal{K}^{\otimes N}), \quad D'_N : \mathcal{B}(\mathcal{K}^{\otimes N}) \to \mathcal{B}(\mathcal{H})^{\otimes N}, \quad N \in \mathbb{N}.$ (6.44)

Note that these $E'_N, D'_N$ play a slightly different role than the $E_j, D_j$ in Equation
(6.39) (and in Definition 6.1.1), because the number of tensor factors of the input
and the output algebra is always identical, while in Equation (6.39) the quotients
of these numbers lead to the achievable rate. To relate both definitions we have
to derive an appropriately chosen family of subspaces
$\mathcal{H}'_N \subset \mathcal{K}^{\otimes N}$ from the $\rho_N$ such that the minimal
fidelities $\mathcal{F}_p(\mathcal{H}'_N, E'_N T^{\otimes N} D'_N)$ of these subspaces go to 1
as $N \to \infty$. If we identify the $\mathcal{H}'_N$ with tensor products of
$\mathbb{C}^2$ and the $E_j, D_j$ of Equation (6.39) with restrictions of $E'_N, D'_N$ to
these tensor products we recover Equation (6.38). A precise implementation of this rough
idea can be found in the literature, and it shows that both definitions just discussed are
indeed equivalent.

6.3.2. Upper bounds and achievable rates. — Although there is no coding 
theorem for the quantum capacity Cq(T), there is a fairly good candidate which is 
related to the coherent information 

$J(\rho, T) = S(T^* \rho) - S(\rho, T).$ (6.45)

Here $S(T^* \rho)$ is the entropy of the output state and $S(\rho, T)$ is the entropy
exchange defined in Equation (6.26). It is argued that $J(\rho, T)$ plays a role in quantum
information theory which is analogous to that of the (classical) mutual information
(6.21) in classical information theory. $J(\rho, T)$ has some nasty properties, however:
it can be negative and it is known to be not additive. To relate it to $C_q(T)$
it is therefore not sufficient to consider a one-shot capacity as in Shannon's Theorem
(Theorem 6.2.1). Instead we have to define

$C_s(T) = \sup_N \frac{1}{N} C_{s,1}(T^{\otimes N})$ with $C_{s,1}(T) = \sup_\rho J(\rho, T).$ (6.46)

It has been shown that $C_s(T)$ is an upper bound on $C_q(T)$. Equality, however,
is conjectured but not yet proven, although there are good heuristic arguments [101].

A second interesting quantity which provides an upper bound on the quantum
capacity uses the transposition operation $\Theta$ on the output systems. More precisely
it is shown in [84] that

$C_q(T) \leq C_\Theta(T) = \log_2 \|\Theta T\|_{cb}$ (6.47)

holds for any channel. In contrast to many other calculations in this field it is
particularly easy to derive this relation from properties of the cb-norm. Hence we are
able to give a proof here. We start with the fact that $\|\Theta\|_{cb} = d$ if $d$ is the
dimension of the Hilbert space on which $\Theta$ operates. Assume that
$N_j / M_j \to c < C_q(T)$ and $j$ large enough such that
$\|\mathrm{Id}_2^{\otimes N_j} - E_j T^{\otimes M_j} D_j\|_{cb} \leq \epsilon$ with appropriate
encodings and decodings $E_j, D_j$. We get

$2^{N_j} = \|\mathrm{Id}_2^{\otimes N_j} \Theta\|_{cb} \leq \|\Theta (\mathrm{Id}_2^{\otimes N_j} - E_j T^{\otimes M_j} D_j)\|_{cb} + \|\Theta E_j T^{\otimes M_j} D_j\|_{cb}$ (6.48)
$\leq 2^{N_j} \|\mathrm{Id}_2^{\otimes N_j} - E_j T^{\otimes M_j} D_j\|_{cb} + \|\Theta E_j \Theta\, (\Theta T)^{\otimes M_j} D_j\|_{cb}$ (6.49)
$\leq 2^{N_j} \epsilon + \|\Theta T\|_{cb}^{M_j}$ (6.50)

where we have used for the last inequality the fact that $D_j$ and $\Theta E_j \Theta$ are
channels and that the cb-norm is multiplicative. Taking on both sides the logarithm we get

$\frac{N_j}{M_j} + \frac{\log_2(1 - \epsilon)}{M_j} \leq \log_2 \|\Theta T\|_{cb}.$ (6.51)

In the limit $j \to \infty$ this implies $c \leq \log_2 \|\Theta T\|_{cb}$ and therefore
$C_q(T) \leq \log_2 \|\Theta T\|_{cb} = C_\Theta(T)$ as stated.

Since $C_\Theta(T)$ is an upper bound on $C_q(T)$ it is particularly useful to check whether
the quantum capacity for a particular channel is zero. If, e.g., $T$ is classical we have
$\Theta T = T$ since the transposition coincides on a classical algebra $\mathcal{C}_d$ with
the identity (elements of $\mathcal{C}_d$ are just diagonal matrices). This implies
$C_\Theta(T) = \log_2 \|\Theta T\|_{cb} = \log_2 \|T\|_{cb} = 0$, because the cb-norm of a
channel is 1. We see therefore that the quantum capacity of a classical channel is 0 - this
is just another proof of the no-teleportation theorem. A slightly more general result
concerns channels $T = RS$ which are the composition of a preparation
$R : \mathcal{M}_d \to \mathcal{C}_f$ and a subsequent measurement
$S : \mathcal{C}_f \to \mathcal{M}_d$. It is easy to see that $\Theta T = \Theta R S$ is a
channel, because $\Theta R \Theta$ is a channel and $\Theta$ is the identity on
$\mathcal{C}_f$, hence $\Theta R \Theta = \Theta R$ and
$\Theta R \Theta S = \Theta R S = \Theta T$. Again we get $C_\Theta(T) = 0$.

Let us consider now some examples. The simplest case is again the quantum
erasure channel from Equation (6.29). As for the classical capacities its quantum
capacity can be explicitly calculated [15] and we have
$C_q(T) = \max(0, (1 - 2\vartheta) \log_2(d))$; cf. Figure 6.1.

For the depolarizing channel (6.30) precise calculations of $C_q(T)$ are not available.
Hence let us consider first the coherent information. $J(\rho, T)$ inherits from $T$
its unitary covariance, i.e. we have $J(U \rho U^*, T) = J(\rho, T)$. In contrast to the
mutual information, however, it does not have nice concavity properties, which makes the
optimization over all input states more difficult to solve. Nevertheless, the calculation
of $J(\rho, T)$ is straightforward and we get in the qubit case (if $\vartheta$ is the noise
parameter of $T$ and $\lambda$ is the highest eigenvalue of $\rho$):

$J(\rho, T) = S\Bigl( \lambda (1 - \vartheta) + \frac{\vartheta}{2} \Bigr) + S\Bigl( (1 - \lambda)(1 - \vartheta) + \frac{\vartheta}{2} \Bigr) - S\Bigl( \frac{1 - \vartheta/2 + A}{2} \Bigr) - S\Bigl( \frac{1 - \vartheta/2 - A}{2} \Bigr) - S\Bigl( \frac{\lambda \vartheta}{2} \Bigr) - S\Bigl( \frac{(1 - \lambda) \vartheta}{2} \Bigr),$ (6.52)

where $S(x) = -x \log_2(x)$ denotes again the entropy function and

$A = \sqrt{(2\lambda - 1)^2 (1 - \vartheta/2)^2 + 4 \lambda (1 - \lambda)(1 - \vartheta)^2}.$ (6.53)

Optimization over $\lambda$ can be performed at least numerically (the maximum is
attained at the left boundary ($\lambda = 1/2$) if $J$ is positive there, and the right
boundary otherwise). The result is plotted together with $C_\Theta(T)$ in Figure 6.6 as a
function of $\vartheta$. The quantity $C_\Theta(T)$ is much easier to compute and we get

$C_\Theta(T) = \max\{0, \log_2(2 - \tfrac{3}{2} \vartheta)\}.$ (6.54)
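
The numerical optimization mentioned above is short enough to sketch. The code assumes the eigenvalues that follow from a direct calculation for the input state with eigenvalues $\lambda$ and $1 - \lambda$: the output state has eigenvalues $\lambda(1-\vartheta) + \vartheta/2$ and $(1-\lambda)(1-\vartheta) + \vartheta/2$, and the state entering the entropy exchange has eigenvalues $(1 - \vartheta/2 \pm A)/2$, $\lambda\vartheta/2$ and $(1-\lambda)\vartheta/2$. The transposition bound is taken as $\max\{0, \log_2(2 - 3\vartheta/2)\}$. The function names and the grid search are illustrative choices.

```python
import math


def s(x):
    """Entropy function S(x) = -x log2(x) with S(0) = 0."""
    return 0.0 if x <= 0.0 else -x * math.log2(x)


def coherent_information(lam, theta):
    """J(rho, T) for a qubit depolarizing channel; lam = highest eigenvalue of rho."""
    a = math.sqrt((2 * lam - 1) ** 2 * (1 - theta / 2) ** 2
                  + 4 * lam * (1 - lam) * (1 - theta) ** 2)
    # Entropy of the output state T*(rho).
    out = s(lam * (1 - theta) + theta / 2) + s((1 - lam) * (1 - theta) + theta / 2)
    # Entropy exchange: entropy of the purification sent through T (x) Id.
    exch = (s((1 - theta / 2 + a) / 2) + s((1 - theta / 2 - a) / 2)
            + s(lam * theta / 2) + s((1 - lam) * theta / 2))
    return out - exch


def one_shot_coherent_capacity(theta, steps=1000):
    """Grid search for the maximum of J over lam in [1/2, 1]."""
    return max(coherent_information(0.5 + 0.5 * i / steps, theta)
               for i in range(steps + 1))


def transposition_bound(theta):
    """C_Theta(T) = max(0, log2(2 - 3 theta / 2)) for 0 <= theta <= 1."""
    return max(0.0, math.log2(2 - 1.5 * theta))


if __name__ == "__main__":
    for theta in (0.0, 0.1, 0.2, 0.3, 0.5):
        print(theta, one_shot_coherent_capacity(theta), transposition_bound(theta))
```

At $\lambda = 1/2$ the expression reduces to $1 - H(3\vartheta/4) - (3\vartheta/4)\log_2 3$ with the binary entropy $H$, which gives a quick check of the implementation.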
To get a lower bound on $C_q(T)$ we have to show that a certain rate $r \leq C_q(T)$
can be achieved with an appropriate sequence

$E_M : \mathcal{M}_2^{\otimes M} \to \mathcal{M}_2^{\otimes N(M)}, \quad M, N(M) \in \mathbb{N}$ (6.55)

of error correcting codes and corresponding decodings $D_M$. I.e. we need

$\lim_{M\to\infty} N(M)/M = r$ and $\lim_{M\to\infty} \|E_M T^{\otimes M} D_M - \mathrm{Id}\|_{cb} = 0.$ (6.56)

Figure 6.6: $C_\Theta(T)$, $C_s(T)$ and the Hamming bound of a depolarizing qubit channel
plotted as function of the noise parameter $\vartheta$.

To find such a sequence note first that we can look at the depolarizing channel
as a device which produces an error with probability $\vartheta$ and leaves the quantum
information intact otherwise. If more and more copies of $T$ are used in parallel, i.e.
if $M$ goes to infinity, the number of errors approaches therefore $\vartheta M$. In other
words the probability to have more than $\vartheta M$ errors vanishes asymptotically. To see
this consider

$T^{\otimes M} = \bigl( (1 - \vartheta) \mathrm{Id} + \vartheta d^{-1} \mathrm{tr}(\cdot) \mathbb{1} \bigr)^{\otimes M} = \sum_{K=0}^{M} (1 - \vartheta)^{M-K} \vartheta^K T_K^{(M)},$ (6.57)

where $T_K^{(M)}$ denotes the sum of all $M$-fold tensor products with
$d^{-1} \mathrm{tr}(\cdot) \mathbb{1}$ on $K$ places and $\mathrm{Id}$ on the $M - K$
remaining - i.e. $T_K^{(M)}$ is a channel which produces exactly $K$ errors on $M$
transmitted systems. Now we have



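$\Bigl\| T^{\otimes M} - \sum_{K \leq \vartheta M} (1 - \vartheta)^{M-K} \vartheta^K T_K^{(M)} \Bigr\|_{cb}$ (6.58)
$= \Bigl\| \sum_{K > \vartheta M} (1 - \vartheta)^{M-K} \vartheta^K T_K^{(M)} \Bigr\|_{cb}$ (6.59)
$\leq \sum_{K > \vartheta M} (1 - \vartheta)^{M-K} \vartheta^K \|T_K^{(M)}\|_{cb}$ (6.60)
$= \sum_{K > \vartheta M} \binom{M}{K} (1 - \vartheta)^{M-K} \vartheta^K = R.$ (6.61)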

The quantity $R$ is the tail of a binomial series and vanishes therefore in the limit
$M \to \infty$ (cf. e.g. Appendix B of [131]). This shows that for $M \to \infty$ only terms
$T_K^{(M)}$ with $K \leq \vartheta M$ are relevant in Equation (6.57); in other words at most
$\vartheta M$ errors occur asymptotically, as stated. This implies that we need a sequence
of codes $E_M$ which encode $N(M)$ qubits and correct $\vartheta M$ errors on $M$ places. One
way to get such a sequence is "random coding" - the classical version of this method is
well known from the proof of Shannon's theorem. The idea is, basically, to generate
error correcting codes of a certain type randomly. E.g. we can generate a sequence
of random graphs with $N(M)$ input and $M$ output vertices (cf. Section 4.4). If we
can show that the corresponding codes correct (asymptotically) $\vartheta M$ errors, the
corresponding rate $r = \lim_{M\to\infty} N(M)/M$ is achievable. For the depolarizing channel^
such an analysis, using randomly generated stabilizer codes shows Eq, VTm 



(6.62) 



where H is the binary entropy from Equation (5.16). This bound can be further 
improved using a more clever coding strategy; cf. ||54|. 
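
The concentration of the number of errors used in this argument can be checked numerically. The sketch below sums the binomial tail for $K$ errors on $M$ systems with error probability $\vartheta$. It assumes a threshold $(\vartheta + \varepsilon) M$ with a small margin $\varepsilon > 0$, because the tail beyond exactly $\vartheta M$ tends to a constant of order $1/2$ rather than to zero; the margin and the function name are choices made for this illustration.

```python
import math


def binomial_tail(m: int, theta: float, threshold: float) -> float:
    """P[K > threshold] for K ~ Binomial(m, theta), with 0 < theta < 1.

    Each term is computed in log space to avoid overflow of the binomial
    coefficient for large m.
    """
    total = 0.0
    for k in range(math.floor(threshold) + 1, m + 1):
        log_term = (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
                    + k * math.log(theta) + (m - k) * math.log(1 - theta))
        total += math.exp(log_term)
    return total


if __name__ == "__main__":
    theta, eps = 0.1, 0.05
    for m in (20, 200, 2000):
        print(m, binomial_tail(m, theta, (theta + eps) * m))
```

The printed probabilities decrease rapidly with $M$, which is the behaviour a code correcting slightly more than $\vartheta M$ errors relies on.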

As a third example let us consider again the Gaussian channel studied already
in Subsection 6.2.4. For $C_\Theta(T)$ we have (the corresponding calculation is not trivial
and uses properties of Gaussian channels which we have not discussed; cf. [84])

$C_\Theta(T) = \max\{0, \log_2(k^2 + 1) - \log_2(|k^2 - 1| + 2 N_c)\},$ (6.63)

and we see that $C_\Theta(T)$ and therefore $C_q(T)$ become zero if $N_c$ is large enough
(i.e. $N_c \geq \max\{1, k^2\}$). The coherent information for the Gaussian state $\rho_N$ from
Equation (3.64) has the form

$J(\rho_N, T) = g(N') - g\Bigl( \frac{D + N' - N - 1}{2} \Bigr) - g\Bigl( \frac{D - N' + N - 1}{2} \Bigr)$ (6.64)

with $N'$, $D$ and $g$ as in Subsection 6.2.4. It increases with $N$ and we can calculate
therefore the maximum over all Gaussian states (which might differ from $C_s(T)$) as

$C_G(T) = \lim_{N\to\infty} J(\rho_N, T) = \log_2 k^2 - \log_2 |k^2 - 1| - g\Bigl( \frac{N_c}{|k^2 - 1|} \Bigr).$ (6.65)

We have plotted both quantities in Figure 6.7 as a function of $k$.



^With a more thorough discussion similar results can be obtained for a much more general
class of channels, e.g. all $T$ in a neighbourhood of the identity channel; cf. [114].

Figure 6.7: $C_\Theta(T)$ and $C_s(T)$ of a Gaussian amplification/attenuation channel as a
function of amplification parameter $k$.

Finally let us have a short look at the special case $k = 1$, i.e. $T$ describes in this
case only the influence of classical Gaussian noise on the transmitted qubits. If we set
$k = 1$ in Equation (6.64) and take the limit $N \to \infty$ we get
$C_G(T) = -\log_2(N_c e)$, and $C_\Theta(T)$ becomes
$C_\Theta(T) = \max\{0, -\log_2(N_c)\}$; both quantities are plotted in Figure 6.8. This
special case is interesting because the one-shot coherent information $C_G(T)$ is
achievable, provided the noise parameter $N_c$ satisfies certain conditions^. Hence there
is strong evidence that the quantum capacity lies between the two lines in Figure 6.8.
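
The two Gaussian bounds can be compared numerically, including the special case $k = 1$. The sketch assumes the closed forms as read here: the transposition bound $\max\{0, \log_2(k^2+1) - \log_2(|k^2-1| + 2N_c)\}$ and the large-$N$ limit of the coherent information, $\log_2 k^2 - \log_2|k^2-1| - g(N_c/|k^2-1|)$ for $k \neq 1$ and $-\log_2(N_c e)$ for $k = 1$. It further assumes $N_c > 0$ so that all logarithms are defined. The function names are hypothetical.

```python
import math


def g(x):
    """g(x) = (x + 1) log2(x + 1) - x log2(x), continued by g(0) = 0."""
    if x <= 0.0:
        return 0.0
    return (x + 1) * math.log2(x + 1) - x * math.log2(x)


def transposition_bound_gaussian(k, nc):
    """C_Theta(T) for the amplification/attenuation channel (nc > 0 assumed)."""
    return max(0.0, math.log2(k * k + 1) - math.log2(abs(k * k - 1) + 2 * nc))


def coherent_information_limit(k, nc):
    """Limit N -> infinity of J(rho_N, T); the case k = 1 is treated separately."""
    if k == 1:
        return -math.log2(nc * math.e)
    gap = abs(k * k - 1)
    return math.log2(k * k) - math.log2(gap) - g(nc / gap)


if __name__ == "__main__":
    nc = 0.1
    for k in (0.5, 0.9, 1.0, 1.1, 1.5):
        print(k, coherent_information_limit(k, nc), transposition_bound_gaussian(k, nc))
```

Evaluating the general expression for $k$ close to 1 reproduces the value $-\log_2(N_c e)$, since $g(x)$ behaves like $\log_2(e x)$ for large $x$.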



6.3.3. Relations to entanglement measures. — The duality lemma proved in
Subsection 2.3.3 provides an interesting way to derive bounds on channel capacities
and capacity like quantities from entanglement measures (and vice versa):
To derive a state of a bipartite system from a channel $T$ we can take a maximally
entangled state $\Psi \in \mathcal{H} \otimes \mathcal{H}$, send one particle through $T$ and
get a less entangled pair in the state
$\rho_T = (\mathrm{Id} \otimes T^*) |\Psi\rangle\langle\Psi|$. If on the other hand an entangled
state $\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$ is given, we can use it as a
resource for teleportation and get a channel $T_\rho$. The two maps $\rho \mapsto T_\rho$ and
$T \mapsto \rho_T$ are, however, not inverse to one another. This can be seen easily from the
duality lemma (Theorem 2.3.4): For each state
$\rho \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$ there is a channel $T$ and a pure
state $\Phi \in \mathcal{H} \otimes \mathcal{H}$ such that
$\rho = (\mathrm{Id} \otimes T^*) |\Phi\rangle\langle\Phi|$ holds; but $\Phi$ is in general
not maximally entangled (and uniquely determined by $\rho$). Nevertheless, there are special
cases in which the state derived from $T_\rho$ coincides with $\rho$: A particular class of
examples is given by teleportation channels derived from a Bell-diagonal state.

On $\rho_T$ we can evaluate an entanglement measure $E(\rho_T)$ and get in this way a
quantity which is related to the capacity of $T$. A particularly interesting candidate
for $E$ is the "one-way LOCC" distillation rate $E_{D,\to}$. It is defined in the same way
as the entanglement of distillation $E_D$, except that only one-way LOCC operations
are allowed in its definition. It has been shown that $E_{D,\to}$ is related to $C_q$ by the
inequalities $E_{D,\to}(\rho) \geq C_q(T_\rho)$ and $E_{D,\to}(\rho_T) \leq C_q(T)$. Hence if
$\rho_{T_\rho} = \rho$ we can calculate $E_{D,\to}(\rho)$ in terms of $C_q(T_\rho)$ and vice
versa.



^It is only shown that $\log_2(\lfloor 1/(N_c e) \rfloor)$ can be achieved, where $\lfloor x \rfloor$ denotes the biggest integer less than $x$. It is very likely however that this is only a restriction of the methods used in the proof and not of the result.
Figure 6.8: $C_\Theta(T)$ and $C_S(T)$ of a Gaussian amplification/attenuation channel as a function of the noise parameter $N_c$ (and with $k = 1$). (Curves shown: one-shot coherent information and transposition bound.)



A second interesting example is the transposition bound $C_\Theta(T)$ introduced in the last subsection. It is related to the logarithmic negativity |15q|

\[ E_\Theta(\rho_T) = \log_2 \|(\mathrm{Id} \otimes \Theta)\rho_T\|_1, \tag{6.66} \]

which measures the degree to which the partial transpose of $\rho$ fails to be positive. $E_\Theta$ can be regarded as an entanglement measure although it has some drawbacks: it is not LOCC monotone (Axiom E2), it is not convex (Axiom E3) and, most severely, it does not coincide with the reduced von Neumann entropy on pure states, which we have considered as "the" entanglement measure for pure states. On the other hand it is easy to calculate and it gives bounds on distillation rates and teleportation capacities [15S|. In addition $E_\Theta$ can be used together with the relation between depolarizing channels and isotropic states to derive Equation (3.54) in a very simple way.
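
The statement that the logarithmic negativity is easy to calculate can be illustrated numerically. The following is only a minimal sketch, assuming NumPy is available; the function names `partial_transpose` and `log_negativity` are hypothetical helpers, not notation from the text. It evaluates $\log_2 \|(\mathrm{Id} \otimes \Theta)\rho\|_1$ for a maximally entangled two-qubit state and for a product state.

```python
import numpy as np

def partial_transpose(rho, dims=(2, 2)):
    """Transpose the second tensor factor of a bipartite density matrix."""
    dA, dB = dims
    r = rho.reshape(dA, dB, dA, dB)              # indices (a, b, a', b')
    # swap b <-> b' and flatten back to a (dA*dB) x (dA*dB) matrix
    return r.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def log_negativity(rho, dims=(2, 2)):
    """log2 of the trace norm of the partially transposed state."""
    # The partial transpose of a Hermitian matrix is Hermitian,
    # so its trace norm is the sum of absolute eigenvalues.
    eigs = np.linalg.eigvalsh(partial_transpose(rho, dims))
    return float(np.log2(np.abs(eigs).sum()))

# Maximally entangled two-qubit state (|00> + |11>)/sqrt(2)
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell = np.outer(psi, psi)

# Product state |00><00|
prod = np.zeros((4, 4))
prod[0, 0] = 1.0

print(log_negativity(bell), log_negativity(prod))
```

For the maximally entangled state the partial transpose has one negative eigenvalue $-1/2$, so the trace norm is 2; for the product state the partial transpose is again a state and the trace norm is 1.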



Chapter 7 
Multiple inputs 



We have seen in Chapter 4 that many tasks of quantum information which are impossible with one-shot operations can be approximated by channels which operate on a large number of equally prepared inputs. Typical examples are approximate cloning, undoing noise and distillation of entanglement. There are basically two questions which are interesting for a quantitative analysis: First we can search for the optimal solutions for a fixed number $N$ of input systems and second we can ask for the asymptotic behavior in the limit $N \to \infty$. In the latter case the asymptotic rate, i.e. the number of outputs (of a certain quality) per input system, is of particular interest.

7.1 The general scheme 

Both types of questions just mentioned can be treated (up to a certain degree) independently of the (impossible) task we are dealing with, and we will study the corresponding general scheme in the following. Hence consider a channel $T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ which operates on $N$ input systems and produces $M$ outputs of the same type. Our aim is to optimize a "figure of merit" $\mathcal{F}(T)$ which measures the deviation of $T^*(\rho^{\otimes N})$ from the target functional we want to approximate. The particular type of device we are considering is mainly fixed by the choice of $\mathcal{F}(T)$ and we will discuss in the following the most relevant examples. (Note that we have considered them already on a qualitative level in Chapter 4; cf. in particular Sections 4.2 and 4.3.)



7.1.1. Figures of merit. — Let us start with pure state cloning |68, p2 |35| , 167 , P8[ , i.e. for each (unknown) pure input state $\sigma = |\psi\rangle\langle\psi|$, $\psi \in \mathcal{H}$, the $M$ clones $T^*(\sigma^{\otimes N})$ produced by the channel $T$ should approximate $M$ copies of the input in the common state $\sigma^{\otimes M}$ as well as possible. There are in fact two different possibilities to measure the distance of $T^*(\sigma^{\otimes N})$ to $\sigma^{\otimes M}$. We can either check the quality of each clone separately or we can test in addition the correlations between output systems. With the notation

\[ \sigma^{(j)} = \mathbb{1}^{\otimes (j-1)} \otimes \sigma \otimes \mathbb{1}^{\otimes (M-j)} \in \mathcal{B}(\mathcal{H}^{\otimes M}) \tag{7.1} \]

a figure of merit for the first case is given by

\[ \mathcal{F}_{c,1}(T) = \inf_{j = 1, \ldots, M}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{(j)} T^*(\sigma^{\otimes N})\bigr). \tag{7.2} \]

It measures the worst one-particle fidelity of the output state $T^*(\sigma^{\otimes N})$. If we are interested in correlations too, we have to choose

\[ \mathcal{F}_{c,\mathrm{all}}(T) = \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{\otimes M} T^*(\sigma^{\otimes N})\bigr), \tag{7.3} \]

which is again a "worst case" fidelity, but now of the full output with respect to $M$ uncorrelated copies of the input $\sigma$.

Instead of fidelities we can consider other error quantities like trace-norm distances or relative entropies. In general, however, we do not get significantly different results from such alternative choices; hence we can safely ignore them. Real variants arise if we consider instead of the infima over all pure states quantities which prefer a (possibly discrete or even finite) class of states. Such a choice leads to "state dependent cloning", because the corresponding optimal devices perform better than "universal" ones (i.e. those described by the figures of merit above) on some states but much worse on the rest. We ignore state dependent cloning in this work, because the universal case is physically more relevant and technically more challenging. Other cases which we do not discuss either include "asymmetric cloning", which arises if we trade in Equation (7.2) the quality of one particular output system against the rest (see |^0|), and cloning of mixed states. The latter is much more difficult than the pure state case and even for classical systems, where it is related to the so-called "bootstrap" technique jS^ , nontrivial.

Closely related to cloning is purification, i.e. undoing noise. This means we are considering $N$ systems originally prepared in the same (unknown) pure state $\sigma$ but which have passed a depolarizing channel

\[ R^*\sigma = \vartheta\sigma + (1 - \vartheta)\,\mathbb{1}/d \tag{7.4} \]

afterwards. The task is now to find a device $T$ acting on $N$ of the decohered systems such that $T^*\bigl((R^*\sigma)^{\otimes N}\bigr)$ is as close as possible to the original pure state. We have the same basic choices for a figure of merit as in the cloning problem. Hence we define

\[ \mathcal{F}_{R,1}(T) = \inf_{j = 1, \ldots, M}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\Bigl(\sigma^{(j)} T^*\bigl[(R^*\sigma)^{\otimes N}\bigr]\Bigr) \tag{7.5} \]

and

\[ \mathcal{F}_{R,\mathrm{all}}(T) = \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\Bigl(\sigma^{\otimes M} T^*\bigl[(R^*\sigma)^{\otimes N}\bigr]\Bigr). \tag{7.6} \]

These quantities can be regarded as generalizations of $\mathcal{F}_{c,1}$ and $\mathcal{F}_{c,\mathrm{all}}$, which we recover if $R^*$ is the identity.

Another task we can consider is the approximation of a map $\Theta$ which is positive but not completely positive, like the transposition. Positivity and normalization imply that $\Theta^*$ maps states to states but $\Theta$ can not be realized by a physical device. An explicit example is the universal not gate (UNOT) which maps each pure qubit state $\sigma$ to its orthocomplement $\sigma^\perp$. It is given by the anti-unitary operator

\[ \psi = \alpha|0\rangle + \beta|1\rangle \mapsto \Theta\psi = \bar{\beta}|0\rangle - \bar{\alpha}|1\rangle. \tag{7.7} \]

Since $\Theta\sigma$ is a state if $\sigma$ is, we can ask again for a channel $T$ such that $T^*(\sigma^{\otimes N})$ approximates $(\Theta\sigma)^{\otimes M}$. As in the two previous examples we have the choice to allow arbitrary correlations in the output or not and we get the following figures of merit:

\[ \mathcal{F}_{\Theta,1}(T) = \inf_{j = 1, \ldots, M}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl((\Theta\sigma)^{(j)} T^*(\sigma^{\otimes N})\bigr) \tag{7.8} \]

and

\[ \mathcal{F}_{\Theta,\mathrm{all}}(T) = \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl((\Theta\sigma)^{\otimes M} T^*(\sigma^{\otimes N})\bigr). \tag{7.9} \]

Note that we can plug in for $\Theta$ basically any functional which maps states to states. In addition we can combine Equations (7.5) and (7.6) on the one hand with (7.8) and (7.9) on the other. As a result we would get a measure for devices which undo an operation $R$ and approximate an impossible machine $\Theta$ at the same time.

7.1.2. Covariant operations. — All the functionals just defined give rise to optimization problems which we will study in greater detail in the next sections. This means we are interested in two things: First of all the maximal value of $\mathcal{F}_{\#,\natural}$ (with $\# = c, R, \Theta$ and $\natural = 1, \mathrm{all}$) given by

\[ \mathcal{F}_{\#,\natural}(N, M) = \sup_T \mathcal{F}_{\#,\natural}(T), \tag{7.10} \]

where the supremum is taken over all channels $T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$, and second the particular channel $T$ where the optimum is attained. At first sight a complete solution of these problems seems to be impossible, due to the large dimension of the space of all $T$, which scales exponentially in $M$ and $N$. Fortunately all $\mathcal{F}_{\#,\natural}(T)$ admit a large symmetry group which allows in many cases the explicit calculation of the optimal values $\mathcal{F}_{\#,\natural}(N, M)$ and the determination of optimizers $T$ with a certain covariance behavior. Note that this is an immediate consequence of our decision to restrict the discussion to "universal" procedures, which do not prefer any particular input state.

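Let us consider permutations of the input systems first: If $p \in S_N$ is a permutation on $N$ places and $V_p$ the corresponding unitary on $\mathcal{H}^{\otimes N}$ (cf. Equation (3.7)) we get obviously $T^*(V_p \rho^{\otimes N} V_p^*) = T^*(\rho^{\otimes N})$, hence

\[ \mathcal{F}_{\#,\natural}\bigl(\alpha_p(T)\bigr) = \mathcal{F}_{\#,\natural}(T) \quad \forall p \in S_N \quad \text{with} \quad [\alpha_p(T)](A) = V_p^* T(A) V_p. \tag{7.11} \]

In other words: $\mathcal{F}_{\#,\natural}(T)$ is invariant under permutations of the input systems. Similarly we can show that $\mathcal{F}_{\#,\natural}(T)$ is invariant under permutations of the output systems:

\[ \mathcal{F}_{\#,\natural}\bigl(\beta_p(T)\bigr) = \mathcal{F}_{\#,\natural}(T) \quad \forall p \in S_M \quad \text{with} \quad [\beta_p(T)](A) = T(V_p^* A V_p). \tag{7.12} \]

To see this consider e.g. for $\# = c$ and $\natural = \mathrm{all}$

\[ \mathrm{tr}\bigl[\sigma^{\otimes M} V_p T^*(\rho^{\otimes N}) V_p^*\bigr] = \mathrm{tr}\bigl[V_p^* \sigma^{\otimes M} V_p T^*(\rho^{\otimes N})\bigr] = \mathrm{tr}\bigl[\sigma^{\otimes M} T^*(\rho^{\otimes N})\bigr]. \tag{7.13} \]

For the other cases similar calculations apply.

Finally, none of the $\mathcal{F}_{\#,\natural}(T)$ singles out a preferred direction in the one-particle Hilbert space $\mathcal{H}$. This implies that we can rotate $T$ by local unitaries of the form $U^{\otimes N}$ respectively $U^{\otimes M}$ without changing $\mathcal{F}_{\#,\natural}(T)$. More precisely we have

\[ \mathcal{F}_{\#,\natural}\bigl(\gamma_U(T)\bigr) = \mathcal{F}_{\#,\natural}(T) \quad \forall U \in \mathrm{U}(d) \tag{7.14} \]

with

\[ [\gamma_U(T)](A) = U^{*\otimes N}\, T\bigl(U^{\otimes M} A\, U^{*\otimes M}\bigr)\, U^{\otimes N}. \tag{7.15} \]

The validity of Equation (7.14) can be proven in the same way as (7.11) and (7.12). The details are therefore left to the reader.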

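Now we can average over the groups $S_N$, $S_M$ and $\mathrm{U}(d)$. Instead of the operation $T$ we consider

\[ \bar{T} = \frac{1}{N!\,M!} \sum_{p \in S_N} \sum_{q \in S_M} \int_{\mathrm{U}(d)} \alpha_p \beta_q \gamma_U(T)\, dU, \tag{7.16} \]

where $dU$ denotes the normalized, left invariant Haar measure on $\mathrm{U}(d)$. We see immediately that $\bar{T}$ has the following symmetry properties

\[ \alpha_p(\bar{T}) = \bar{T}, \quad \beta_q(\bar{T}) = \bar{T}, \quad \gamma_U(\bar{T}) = \bar{T}, \quad \forall p \in S_N,\ \forall q \in S_M,\ \forall U \in \mathrm{U}(d) \tag{7.17} \]

and we will call each operation $T$ fully symmetric if it satisfies this equation. The concavity of $\mathcal{F}_{\#,\natural}$ implies immediately that it can not decrease if we replace $T$ by $\bar{T}$:

\[ \mathcal{F}_{\#,\natural}(\bar{T}) = \mathcal{F}_{\#,\natural}\Bigl(\frac{1}{N!\,M!} \sum_{p \in S_N} \sum_{q \in S_M} \int_{\mathrm{U}(d)} \alpha_p \beta_q \gamma_U(T)\, dU\Bigr) \tag{7.18} \]

\[ \geq \frac{1}{N!\,M!} \sum_{p \in S_N} \sum_{q \in S_M} \int_{\mathrm{U}(d)} \mathcal{F}_{\#,\natural}\bigl[\alpha_p \beta_q \gamma_U(T)\bigr]\, dU = \mathcal{F}_{\#,\natural}(T). \tag{7.19} \]

To calculate the optimal value $\mathcal{F}_{\#,\natural}(N, M)$ it is therefore completely sufficient to search a maximizer for $\mathcal{F}_{\#,\natural}(T)$ only among fully symmetric $T$ and to evaluate $\mathcal{F}_{\#,\natural}(T)$ for this particular operation. This simplifies the problem significantly because the size of the parameter space is extremely reduced. Of course we do not know from this argument whether the optimum is also attained on non-symmetric operations; however, this information is in general less important (and for some problems like optimal cloning a uniqueness result is available).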

7.1.3. Group representations. — To get an idea how this parameter reduction can be exploited practically, let us reconsider Theorem 3.1.1: The two representations $U \mapsto U^{\otimes N}$ and $p \mapsto V_p$ of $\mathrm{U}(d)$ respectively $S_N$ on $\mathcal{H}^{\otimes N}$ are "commutants" of each other, i.e., any operator on $\mathcal{H}^{\otimes N}$ commuting with all $U^{\otimes N}$ is a linear combination of the $V_p$, and conversely. This knowledge can be used to decompose the representation $U^{\otimes N}$ (and $V_p$ as well) into irreducible components. To reduce the group theoretic overhead, we will discuss this procedure first for qubits only and come back to the general case afterwards.

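Hence assume that $\mathcal{H} = \mathbb{C}^2$ holds. Then $\mathcal{H}^{\otimes N}$ is the Hilbert space of $N$ (distinguishable) spin-1/2 particles and it can be decomposed in terms of eigenspaces of total angular momentum. More precisely consider

\[ L_k = \frac{1}{2} \sum_j \sigma_k^{(j)}, \quad k = 1, 2, 3, \tag{7.20} \]

the $k$-component of total angular momentum (i.e. $\sigma_k$ is the $k$-th Pauli matrix and $\sigma_k^{(j)} \in \mathcal{B}(\mathcal{H}^{\otimes N})$ is defined according to Equation (7.1)) and $\vec{L}^2 = \sum_k L_k^2$. The eigenvalue expansion of $\vec{L}^2$ is well known to be

\[ \vec{L}^2 = \sum_s s(s+1) P_s, \quad \text{with} \quad s = \begin{cases} 0, 1, \ldots, N/2 & N \text{ even} \\ 1/2, 3/2, \ldots, N/2 & N \text{ odd,} \end{cases} \tag{7.21} \]

where the $P_s$ denote the projections to the eigenspaces of $\vec{L}^2$. It is easy to see that both representations $U \mapsto U^{\otimes N}$ and $p \mapsto V_p$ commute with $\vec{L}^2$. Hence the eigenspaces $P_s \mathcal{H}^{\otimes N}$ of $\vec{L}^2$ are invariant subspaces of $U^{\otimes N}$ and $V_p$ and this implies that the restrictions of $U^{\otimes N}$ and $V_p$ to them are representations of $\mathrm{SU}(2)$ respectively $S_N$. Since $\vec{L}^2$ is constant on $P_s \mathcal{H}^{\otimes N}$ the $\mathrm{SU}(2)$ representation we get in this way must be (naturally isomorphic to) a multiple of the irreducible spin-$s$ representation $\pi_s$. It is defined by

\[ \pi_s\Bigl[\exp\Bigl(\frac{i}{2} \sum_k a_k \sigma_k\Bigr)\Bigr] = \exp\Bigl(i \sum_k a_k L_k^{(s)}\Bigr) \quad \text{with} \quad L_k^{(s)} = \frac{1}{2} \sum_{j=1}^{2s} \sigma_k^{(j)} \tag{7.22} \]

on the representation space

\[ \mathcal{H}_s = \mathcal{H}_+^{\otimes 2s} \tag{7.23} \]

(the Bose-subspace of $\mathcal{H}^{\otimes 2s}$). Hence we get

\[ P_s \mathcal{H}^{\otimes N} \cong \mathcal{H}_s \otimes \mathcal{K}_{N,s}, \quad U^{\otimes N}\psi = \bigl(\pi_s(U) \otimes \mathbb{1}\bigr)\psi \quad \forall \psi \in P_s \mathcal{H}^{\otimes N}. \tag{7.24} \]

Since $V_p$ and $U^{\otimes N}$ commute the Hilbert space $\mathcal{K}_{N,s}$ carries a representation $\hat{\pi}_{N,s}(p)$ of $S_N$ which is irreducible as well. Note that $\mathcal{K}_{N,s}$ depends in contrast to $\mathcal{H}_s$ on the number $N$ of tensor factors and its dimension is (see |10C] or |142] for general $d$)

\[ \dim \mathcal{K}_{N,s} = \frac{2s + 1}{N/2 + s + 1} \binom{N}{N/2 - s}. \tag{7.25} \]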
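Summarizing the discussion we get

\[ \mathcal{H}^{\otimes N} \cong \bigoplus_s \mathcal{H}_s \otimes \mathcal{K}_{N,s}, \quad U^{\otimes N} \cong \bigoplus_s \pi_s(U) \otimes \mathbb{1}, \quad V_p \cong \bigoplus_s \mathbb{1} \otimes \hat{\pi}_{N,s}(p). \tag{7.26} \]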
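
The dimension formula (7.25) and the decomposition (7.26) can be cross-checked by counting dimensions: since $\dim \mathcal{H}_s = 2s + 1$, the sum $\sum_s (2s+1) \dim \mathcal{K}_{N,s}$ has to equal $2^N$. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the multiplicity is taken in the form $\dim \mathcal{K}_{N,s} = \frac{2s+1}{N/2+s+1}\binom{N}{N/2-s}$.

```python
from fractions import Fraction
from math import comb

def spins(N):
    """Allowed total spins s for N qubits, as exact fractions."""
    start = Fraction(N % 2, 2)                 # 0 for even N, 1/2 for odd N
    return [start + k for k in range(N // 2 + 1)]

def dim_K(N, s):
    """Multiplicity dim K_{N,s} of the spin-s representation, Equation (7.25)."""
    s = Fraction(s)
    k = Fraction(N, 2) - s                     # N/2 - s is always an integer
    value = (2 * s + 1) / (Fraction(N, 2) + s + 1) * comb(N, int(k))
    assert value.denominator == 1              # the formula yields integers
    return int(value)

def total_dimension(N):
    """Sum over s of dim(H_s) * dim(K_{N,s}) with dim(H_s) = 2s + 1."""
    return sum(int(2 * s + 1) * dim_K(N, s) for s in spins(N))

for N in range(1, 7):
    print(N, [(str(s), dim_K(N, s)) for s in spins(N)], total_dimension(N))
```

For $N = 3$, for example, the spin-1/2 representation appears twice and the spin-3/2 representation once, giving $2 \cdot 2 + 4 \cdot 1 = 8 = 2^3$.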

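Let us consider now a fully symmetric operation $T$. Permutation invariance ($\alpha_p(T) = T$ and $\beta_p(T) = T$) implies together with Equation (7.26) that

\[ T(A_j \otimes B_j) = \bigoplus_s \frac{\mathrm{tr}(B_j)}{\dim \mathcal{K}_{M,j}}\, T_{sj}(A_j) \otimes \mathbb{1} \quad \text{with} \quad T_{sj} : \mathcal{B}(\mathcal{H}_j) \to \mathcal{B}(\mathcal{H}_s), \tag{7.27} \]

holds if $A_j \otimes B_j \in \mathcal{B}(\mathcal{H}_j \otimes \mathcal{K}_{M,j})$. The operations $T_{sj}$ are unital and have, according to $\gamma_U(T) = T$, the following covariance properties

\[ \pi_s(U)\, T_{sj}(A_j)\, \pi_s(U^*) = T_{sj}\bigl[\pi_j(U) A_j \pi_j(U^*)\bigr] \quad \forall U \in \mathrm{SU}(2). \tag{7.28} \]

The classification of all fully symmetric channels $T$ is reduced therefore to the study of all these $T_{sj}$.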



We can apply now the covariant version of Stinespring's theorem (Theorem 3.2.2) to find that

\[ T_{sj}(A_j) = V^*(A_j \otimes \mathbb{1})V, \quad V : \mathcal{H}_s \to \mathcal{H}_j \otimes \tilde{\mathcal{H}}, \quad V \pi_s(U) = \pi_j(U) \otimes \tilde{\pi}(U)\, V, \tag{7.29} \]

where $\tilde{\pi}$ is a representation of $\mathrm{SU}(2)$ on $\tilde{\mathcal{H}}$. If $\tilde{\pi}$ is irreducible with total angular momentum $l$ the "intertwining operator" $V$ is well known: Its components in a particularly chosen basis coincide with certain Clebsch-Gordan coefficients. Hence the corresponding operation is uniquely determined (up to unitary equivalence) and we write

\[ T_{sjl}(A_j) = V_l^*(A_j \otimes \mathbb{1})V_l, \quad V_l \pi_s(U) = \pi_j(U) \otimes \pi_l(U)\, V_l, \tag{7.30} \]

where $l$ can range from $|j - s|$ to $j + s$. Since a general representation $\tilde{\pi}$ can be decomposed into irreducible components we see that each covariant $T_{sj}$ is a convex linear combination of the $T_{sjl}$ and we get with Equation (7.27)

\[ T(A_j \otimes B_j) = \bigoplus_s \sum_l c_{jl} \bigl[T_{sjl}(A_j) \otimes (\mathrm{tr}(B_j)\mathbb{1})\bigr], \tag{7.31} \]

where the $c_{jl}$ are constrained by $c_{jl} \geq 0$ and $\sum_l c_{jl} = (\dim \mathcal{K}_{M,j})^{-1}$. In this way we have parameterized the set of fully symmetric operations completely in terms of group theoretical data and we can rewrite $\mathcal{F}_{\#,\natural}(T)$ accordingly. This leads to an optimization problem for a quantity depending only on $s$, $j$ and $l$, which is at least in some cases solvable.

To generalize the scheme just presented to the case $\mathcal{H} = \mathbb{C}^d$ with arbitrary $d$ we only have to find a replacement for the decomposition in Equation (7.26). This, however, is well known from group theory:

\[ \mathcal{H}^{\otimes N} \cong \bigoplus_Y \mathcal{H}_Y \otimes \mathcal{K}_Y, \quad U^{\otimes N} \cong \bigoplus_Y \pi_Y(U) \otimes \mathbb{1}, \quad V_p \cong \bigoplus_Y \mathbb{1} \otimes \hat{\pi}_Y(p), \tag{7.32} \]

where $\pi_Y : \mathrm{U}(d) \to \mathcal{B}(\mathcal{H}_Y)$ and $\hat{\pi}_Y : S_N \to \mathcal{B}(\mathcal{K}_Y)$ are irreducible representations. The summation index $Y$ runs over all Young frames with $d$ rows and $N$ boxes, i.e., over the arrangements of $N$ boxes into $d$ rows of lengths $Y_1 \geq Y_2 \geq \cdots \geq Y_d \geq 0$ with $\sum_k Y_k = N$. The relation to total angular momentum $s$ used as the parameter for $d = 2$ is given by $Y_1 - Y_2 = 2s$, which determines $Y$ together with $Y_1 + Y_2 = N$ completely. The rest of the arguments applies without significant changes; this is in particular the case for Equation (7.31), which holds for general $d$ if we replace $s$, $j$ and $l$ by Young frames. However, the representation theory of $\mathrm{U}(d)$ becomes much more difficult. The generalization of results available for qubits ($d = 2$) to $d > 2$ is therefore by no means straightforward.
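
The labelling by Young frames can be made concrete with a small enumeration. The sketch below is an illustration under stated assumptions, not part of the original argument: it uses the standard hook length formula for the dimension of the $S_N$ representation and the hook content formula for the dimension of the $\mathrm{U}(d)$ representation (neither formula is quoted in the text), and checks that the dimensions in the decomposition add up to $d^N$. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def young_frames(N, d, max_row=None):
    """Partitions of N into at most d nonzero rows Y1 >= Y2 >= ... (trailing zeros omitted)."""
    if max_row is None:
        max_row = N
    if N == 0:
        return [()]
    if d == 0:
        return []
    frames = []
    for first in range(min(N, max_row), 0, -1):
        for rest in young_frames(N - first, d - 1, first):
            frames.append((first,) + rest)
    return frames

def hooks(Y):
    """Yield (row, column, hook length) for every box of the frame Y."""
    cols = [sum(1 for r in Y if r > j) for j in range(Y[0])] if Y else []
    for i, row in enumerate(Y):
        for j in range(row):
            yield i, j, (row - j) + (cols[j] - i) - 1

def dim_symmetric_group(Y):
    """Hook length formula: dimension of the S_N representation labelled by Y."""
    prod = 1
    for _, _, h in hooks(Y):
        prod *= h
    return factorial(sum(Y)) // prod

def dim_unitary_group(Y, d):
    """Hook content formula: dimension of the U(d) representation labelled by Y."""
    value = Fraction(1)
    for i, j, h in hooks(Y):
        value *= Fraction(d + j - i, h)
    return int(value)

def total_dimension(N, d):
    return sum(dim_unitary_group(Y, d) * dim_symmetric_group(Y)
               for Y in young_frames(N, d))

print(young_frames(3, 2), total_dimension(3, 2), total_dimension(3, 3))
```

For $d = 2$ a frame $(Y_1, Y_2)$ gives a $\mathrm{U}(2)$ dimension of $Y_1 - Y_2 + 1 = 2s + 1$, in agreement with the relation $Y_1 - Y_2 = 2s$ stated above.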

Finally let us give a short comment on Gaussian states here. Obviously the methods just described do not apply in this case. However, we can consider instead of $U^{\otimes N}$-covariance, covariance with respect to phase-space translations. Following this idea some results concerning optimal cloning of Gaussian states have been obtained (see and the references therein), but the corresponding general theory is not as far developed as in the finite dimensional case.

7.1.4. Distillation of entanglement. — Finally let us have another look at distillation of entanglement. The basic idea is quite the same as for optimal cloning: Use multiple inputs to approximate a task which is impossible with one-shot operations. From a more technical point of view, however, it does not fit into the general scheme proposed up to now. Nevertheless, some of the arguments can be adopted in an easy way. First of all we have to replace the "one-particle" Hilbert space $\mathcal{H}$ with a two-fold tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$ and the channels we have to look at are LOCC operations

\[ T : \mathcal{B}(\mathcal{H}_A^{\otimes M} \otimes \mathcal{H}_B^{\otimes M}) \to \mathcal{B}(\mathcal{H}_A^{\otimes N} \otimes \mathcal{H}_B^{\otimes N}); \tag{7.33} \]

cf. Section 4.3. Our aim is to determine $T$ such that $T^*(\rho^{\otimes N})$ is, for each distillable (mixed) state $\rho \in \mathcal{B}^*(\mathcal{H}_A \otimes \mathcal{H}_B)$, close to the $M$-fold tensor product $|\Psi\rangle\langle\Psi|^{\otimes M}$ of a maximally entangled state $\Psi \in \mathcal{H}_A \otimes \mathcal{H}_B$. A figure of merit with a similar structure as the $\mathcal{F}_{\#,\mathrm{all}}$ studied above can be derived directly from the definition of the entanglement measure $E_D$ in Section 5.1.3: We define (replacing the trace-norm distance with a fidelity)

\[ \mathcal{F}_D(T) = \inf_\rho \inf_\Psi \bigl\langle \Psi^{\otimes M}, T^*(\rho^{\otimes N}) \Psi^{\otimes M} \bigr\rangle \tag{7.34} \]

where the infima are taken over all maximally entangled states $\Psi$ and all distillable states $\rho$. Alternatively we can look at state dependent measures, which seem to be particularly important if we try to calculate $E_D(\rho)$ for some state $\rho$. In this case we simply get

\[ \mathcal{F}_{D,\rho}(T) = \inf_\Psi \bigl\langle \Psi^{\otimes M}, T^*(\rho^{\otimes N}) \Psi^{\otimes M} \bigr\rangle. \tag{7.35} \]

To translate the group theoretical analysis of the last two subsections is somewhat more difficult. As in the case of $\mathcal{F}_{\#,\natural}$ we can restrict the search for optimizers to permutation invariant operations, i.e. $\alpha_p(T) = T$ and $\beta_p(T) = T$ in the terminology of Subsection 7.1.2. Unitary covariance

\[ T\bigl(U^{\otimes M} A\, U^{*\otimes M}\bigr) = U^{\otimes N}\, T(A)\, U^{*\otimes N}, \tag{7.36} \]

however, can not be assumed for all unitaries $U$ of $\mathcal{H}_A \otimes \mathcal{H}_B$, but only for local ones ($U = U_A \otimes U_B$) in the case of $\mathcal{F}_D$, or only for local $U$ which leave $\rho$ invariant for $\mathcal{F}_{D,\rho}$. This makes the analog of the decomposition scheme from Subsection 7.1.3 more difficult and such a study has (to my knowledge) not yet been done. A related subproblem arises if we consider $\mathcal{F}_{D,\rho}$ from Equation (7.35) for a state $\rho$ with special symmetry properties; e.g. an OO-invariant state. The corresponding optimization might be simpler and a solution would be relevant for the calculation of $E_D$.

7.2 Optimal devices 

Now we can consider the optimization problems associated with the figures of merit discussed in the last section. This means that we are searching for those devices which approximate the impossible tasks in question in the best possible way. As pointed out at the beginning of this chapter this can be done for finite $N$ and in the limit $N \to \infty$. The latter is postponed to the next section.

7.2.1. Optimal cloning. — The quality of an optimal, pure state cloner is defined by the figures of merit $\mathcal{F}_{c,\natural}$ in Equations (7.2) and (7.3) and the group theoretic ideas sketched in Subsection 7.1.3 allow the complete solution of this problem. We will demonstrate some of the basic ideas in the qubit case first and state the final result afterwards in full generality.

The solvability of this problem relies in part on the special structure of the figures of merit $\mathcal{F}_{c,\natural}$, which allows further simplifications of the general scheme sketched in Subsection 7.1.3. If we consider e.g. $\mathcal{F}_{c,1}(T)$ (the other case works similarly) we get:

\[ \mathcal{F}_{c,1}(T) = \inf_{j = 1, \ldots, M}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{(j)} T^*(\sigma^{\otimes N})\bigr) \tag{7.37} \]

\[ = \inf_{j = 1, \ldots, M}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(T(\sigma^{(j)}) \sigma^{\otimes N}\bigr) \tag{7.38} \]

\[ = \inf_{j = 1, \ldots, M}\ \inf_{\psi} \bigl\langle \psi^{\otimes N}, T(\sigma^{(j)}) \psi^{\otimes N} \bigr\rangle. \tag{7.39} \]

Hence $\mathcal{F}_{c,\natural}$ only depends on the $\mathcal{B}(\mathcal{H}_+^{\otimes N})$ component (where $\mathcal{H}_+^{\otimes N}$ denotes again the Bose-subspace of $\mathcal{H}^{\otimes N}$) of $T$ and we can assume without loss of generality that $T$ is of the form

\[ T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}_+^{\otimes N}). \tag{7.40} \]

The restriction of $U^{\otimes N}$ to $\mathcal{H}_+^{\otimes N}$ is an irreducible representation (for any $d$) and in the qubit case ($d = 2$) we have $U^{\otimes N}\psi = \pi_s(U)\psi$ with $s = N/2$ for all $\psi \in \mathcal{H}_+^{\otimes N}$. The decomposition of $T$ from Equation (7.27) contains therefore only those summands with $s = N/2$. This simplifies the optimization problem significantly, since the number of variables needed to parametrize all relevant cloning maps according to Equation (7.31) is reduced from 3 to 2. A more detailed (and non-trivial) analysis shows that the maximum for $\mathcal{F}_{c,1}$ and $\mathcal{F}_{c,\mathrm{all}}$ is attained if all terms in (7.31) except the one with $s = N/2$, $j = M/2$ and $l = (M - N)/2$ vanish. The precise result is stated in the following theorem (|^, ^ for qubits and [167, ^ for general $d$).



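Theorem 7.2.1 For each $\mathcal{H} = \mathbb{C}^d$ both figures of merit $\mathcal{F}_{c,1}$ and $\mathcal{F}_{c,\mathrm{all}}$ are maximized by the cloner

\[ \hat{T}^*(\rho) = \frac{d[N]}{d[M]}\, S_M \bigl(\rho \otimes \mathbb{1}^{\otimes (M-N)}\bigr) S_M, \tag{7.41} \]

where $d[N]$, $d[M]$ denote the dimensions of the symmetric tensor products $\mathcal{H}_+^{\otimes N}$ respectively $\mathcal{H}_+^{\otimes M}$ and $S_M$ is the projection from $\mathcal{H}^{\otimes M}$ to $\mathcal{H}_+^{\otimes M}$. This implies for the optimal fidelities

\[ \mathcal{F}_{c,1}(N, M) = \frac{d-1}{d}\,\frac{N}{N+d}\,\frac{M+d}{M} + \frac{1}{d} \tag{7.42} \]

and

\[ \mathcal{F}_{c,\mathrm{all}}(N, M) = \frac{d[N]}{d[M]}. \tag{7.43} \]

$\hat{T}$ is the unique solution for both optimization problems, i.e. there is no other operation $T$ of the form (7.40) which maximizes $\mathcal{F}_{c,1}$ or $\mathcal{F}_{c,\mathrm{all}}$.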
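
The optimal fidelities of Theorem 7.2.1 are simple enough to evaluate exactly. The sketch below is only an illustration with hypothetical function names; it assumes the one-clone fidelity in the form $\frac{d-1}{d}\gamma + \frac{1}{d}$ with shrinking factor $\gamma = \frac{N}{N+d}\frac{M+d}{M}$, and uses $d[n] = \binom{d+n-1}{n}$ for the dimension of the symmetric subspace.

```python
from fractions import Fraction
from math import comb

def dim_sym(d, n):
    """d[n]: dimension of the symmetric (Bose) subspace of n d-level systems."""
    return comb(d + n - 1, n)

def fidelity_one(d, N, M):
    """One-clone fidelity of the optimal N -> M cloner (assumed form, see lead-in)."""
    shrink = Fraction(N, N + d) * Fraction(M + d, M)
    return Fraction(d - 1, d) * shrink + Fraction(1, d)

def fidelity_all(d, N, M):
    """All-clone fidelity d[N]/d[M] of the optimal N -> M cloner."""
    return Fraction(dim_sym(d, N), dim_sym(d, M))

# Qubits, one input, two clones
print(fidelity_one(2, 1, 2), fidelity_all(2, 1, 2))
# No cloning needed when M == N: both fidelities are exactly one
print(fidelity_one(3, 4, 4), fidelity_all(3, 4, 4))
```

Two sanity checks are built in: for $M = N$ both expressions reduce to 1, and for the qubit case $1 \to 2$ the sketch returns the familiar values 5/6 and 2/3.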
There are two aspects of this result which deserve special attention. One is the relation to state estimation, which is postponed to Subsection 7.2.3. The second concerns the role of correlations: It does not matter whether we are looking for the quality of each single clone ($\mathcal{F}_{c,1}$) only, or whether correlations are taken into account ($\mathcal{F}_{c,\mathrm{all}}$). In both cases we get the same optimal solution. This is a special feature of pure states, however. Although there are no concrete results for quantum systems, it can be checked quite easily in the classical case that considering correlations changes the optimal cloner for arbitrary mixed states drastically.

7.2.2. Purification. — To find an optimal purification device, i.e. maximizing $\mathcal{F}_{R,\natural}$, is more difficult than the cloning problem, because the simplification from Equation (7.40) does not apply. Hence we have to consider all the summands in the direct sum decomposition of $T$ from Equation (7.31) and solutions are available only for qubits. Therefore we will assume for the rest of this subsection that $\mathcal{H} = \mathbb{C}^2$ holds. The $\mathrm{SU}(2)$ symmetry of the problem allows us to assume without loss of generality that the pure initial state $\psi$ coincides with one of the basis vectors. Hence we get for the (noisy) input states of the purifier

\[ \rho(\beta) = \frac{1}{2\cosh(\beta)} \exp\Bigl(2\beta\,\frac{\sigma_3}{2}\Bigr) \tag{7.44} \]

\[ = \tanh(\beta)\,|\psi\rangle\langle\psi| + \bigl(1 - \tanh(\beta)\bigr)\,\frac{1}{2}\mathbb{1}, \quad \psi = |0\rangle. \tag{7.45} \]

The parameterization of $\rho$ in terms of the "pseudo-temperature" $\beta$ is chosen here because it simplifies some calculations significantly (as we will see soon). The relation to the form of $\rho = R^*\sigma$ initially given in Equation (7.4) is obviously $\vartheta = \tanh(\beta)$.

Figure 7.1: One- and all-qubit fidelities of the optimal purifier for $N = 100$ and $M = 10$. Plotted as a function of the noise parameter $\vartheta$.

To state the main result of this subsection we have to decompose the product state $\rho(\beta)^{\otimes N}$ into spin-$s$ components. This can be done in terms of Equation (7.26). $\rho(\beta)$ is not unitary of course. However we can apply (7.26) by analytic continuation, i.e. we treat $\rho(\beta)$ in the same way as we would $\exp(i\beta\sigma_3)$. It is then straightforward to get

\[ \rho(\beta)^{\otimes N} = \bigoplus_s w_N(s)\, \rho_s(\beta) \otimes \frac{\mathbb{1}}{\dim \mathcal{K}_{N,s}} \tag{7.46} \]

with

\[ w_N(s) = \frac{\sinh((2s+1)\beta)}{\sinh(\beta)\,(2\cosh(\beta))^N}\, \dim \mathcal{K}_{N,s} \quad \text{and} \quad \rho_s(\beta) = \frac{\sinh(\beta)}{\sinh((2s+1)\beta)} \exp\bigl(2\beta L_3^{(s)}\bigr), \tag{7.47} \]

where $L_3^{(s)}$ is the 3-component of angular momentum in the spin-$s$ representation and the dimension of $\mathcal{K}_{N,s}$ is given in Equation (7.25). By (7.23) the representation space of $\pi_s$ coincides with the symmetric tensor product $\mathcal{H}_+^{\otimes 2s}$. Hence we can interpret $\rho_s(\beta)$ as a state of $2s$ (indistinguishable) particles. In other words the decomposition of $\rho(\beta)^{\otimes N}$ leads in a natural way to a family of operations

\[ Q_s : \mathcal{B}(\mathcal{H}_+^{\otimes 2s}) \to \mathcal{B}(\mathcal{H}^{\otimes N}), \quad \text{with} \quad Q_s^*\bigl[\rho(\beta)^{\otimes N}\bigr] = \rho_s(\beta). \tag{7.48} \]
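
Since the $w_N(s)$ are the probabilities of finding the product state in the spin-$s$ sector, they have to sum to one. The following sketch checks this numerically with the Python standard library only. It assumes the weights in the form $w_N(s) = \frac{\sinh((2s+1)\beta)}{\sinh(\beta)(2\cosh\beta)^N} \dim \mathcal{K}_{N,s}$ with the multiplicity $\dim \mathcal{K}_{N,s} = \frac{2s+1}{N/2+s+1}\binom{N}{N/2-s}$; all function names are illustrative.

```python
from fractions import Fraction
from math import comb, cosh, exp, sinh, tanh

def spins(N):
    """Allowed total spins s for N qubits, as exact fractions."""
    start = Fraction(N % 2, 2)
    return [start + k for k in range(N // 2 + 1)]

def dim_K(N, s):
    """Multiplicity of the spin-s representation in the N-qubit space."""
    s = Fraction(s)
    k = Fraction(N, 2) - s
    return int((2 * s + 1) / (Fraction(N, 2) + s + 1) * comb(N, int(k)))

def weight(N, s, beta):
    """Probability w_N(s) of the spin-s sector in the state rho(beta)^{(x)N}."""
    two_s_plus_1 = float(2 * s + 1)
    return sinh(two_s_plus_1 * beta) / (sinh(beta) * (2 * cosh(beta)) ** N) * dim_K(N, s)

def qubit_eigenvalues(beta):
    """Eigenvalues of rho(beta) = exp(beta * sigma_3) / (2 cosh beta)."""
    return exp(beta) / (2 * cosh(beta)), exp(-beta) / (2 * cosh(beta))

def noise_parameter(beta):
    """Depolarizing parameter theta = tanh(beta)."""
    return tanh(beta)

beta = 0.7
for N in (1, 2, 5, 10):
    print(N, sum(weight(N, s, beta) for s in spins(N)))
```

The helper `qubit_eigenvalues` also allows a quick check of the relation between the two parameterizations: the larger eigenvalue of $\rho(\beta)$ equals $(1 + \tanh\beta)/2$.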

We can think of the family $Q_s$ of operations as an instrument $Q$ which measures the number of output systems and transforms $\rho(\beta)^{\otimes N}$ to the appropriate $\rho_s(\beta)$. The crucial point is now that the purity of $\rho_s(\beta)$, measured in terms of fidelities with respect to $\psi$, increases provided $s > 1/2$ holds. Hence we can think of $Q$ as a purifier which arises naturally by reduction to irreducible spin components. Unfortunately $Q$ does not produce a fixed number of output systems. The most obvious way to construct a device which always produces the same number $M$ of outputs is to run the optimal $2s \to M$ cloner $\hat{T}_{2s \to M}$ if $2s < M$ or to drop $2s - M$ particles if $M < 2s$ holds. More precisely we can define $\hat{Q} : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ by

\[ \hat{Q}^*\bigl[\rho(\beta)^{\otimes N}\bigr] = \sum_s w_N(s)\, \hat{T}^*_{2s \to M}\bigl[\rho_s(\beta)\bigr], \tag{7.49} \]

with

\[ \hat{T}^*_{2s \to M}\,\rho = \begin{cases} \dfrac{d[2s]}{d[M]}\, S_M \bigl(\rho \otimes \mathbb{1}^{\otimes (M-2s)}\bigr) S_M & \text{for } M > 2s \\[2mm] \mathrm{tr}_{2s-M}\,\rho & \text{for } M \leq 2s. \end{cases} \tag{7.50} \]

$\mathrm{tr}_{2s-M}$ denotes here the partial trace over the $2s - M$ first tensor factors. Applying the general scheme of Subsection 7.1.3 shows that this is the best way to get exactly $M$ purified qubits pO^ :



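Theorem 7.2.2 The operation $\hat{Q}$ defined in Equation (7.49) maximizes $\mathcal{F}_{R,1}$ and $\mathcal{F}_{R,\mathrm{all}}$. It is called therefore the optimal purifier. The maximal values for $\mathcal{F}_{R,1}$ and $\mathcal{F}_{R,\mathrm{all}}$ are given by

\[ \mathcal{F}_{R,1}(N, M) = \sum_s w_N(s)\, f_1(M, \beta, s), \quad \mathcal{F}_{R,\mathrm{all}}(N, M) = \sum_s w_N(s)\, f_{\mathrm{all}}(M, \beta, s) \tag{7.51} \]

with

\[ 2 f_1(M, \beta, s) - 1 = \begin{cases} \dfrac{2s+1}{2s} \coth\bigl((2s+1)\beta\bigr) - \dfrac{1}{2s} \coth\beta & \text{for } 2s > M \\[2mm] \dfrac{1}{2s+2}\,\dfrac{M+2}{M} \Bigl((2s+1)\coth\bigl((2s+1)\beta\bigr) - \coth\beta\Bigr) & \text{for } 2s \leq M \end{cases} \tag{7.52} \]

and

\[ f_{\mathrm{all}}(M, \beta, s) = \begin{cases} \dfrac{2s+1}{M+1}\,\dfrac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}} & M > 2s \\[2mm] \dfrac{1 - e^{-2\beta}}{1 - e^{-(4s+2)\beta}} \dbinom{2s}{M}^{-1} \displaystyle\sum_K \dbinom{K}{M} e^{2\beta(K - 2s)} & M \leq 2s. \end{cases} \tag{7.53} \]

Figure 7.2: One- and all-qubit fidelities of the optimal purifier for $\vartheta = 0.5$ and $M = 10$. Plotted as a function of $N$.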



The expressions for the optimal fidelities given here look rather complicated and are not very illuminating. We have therefore plotted both quantities as a function of $\vartheta$ (Figure 7.1), of $N$ (Figure 7.2) and of $M$ (Figure 7.3). While the first two plots look quite similar, the functional behavior in dependence on $M$ seems to be very different. The study of the asymptotic behavior in the next section will give a precise analysis of this observation.



7.2.3. Estimating pure states. — We have already seen in Section 4.2 that the cloning problem and state estimation are closely related, because we can construct an approximate cloner $T$ from an estimator $E$ simply by running $E$ on the $N$ input states, and preparing $M$ systems according to the attained classical information. In this section we want to go the other way round and show that the optimal cloner derived in Theorem 7.2.1 leads immediately to an optimal pure state estimator; cf. l3l.

To this end let us assume that $E$ has the form (cf. Section 4.2)

\[ \mathcal{C}(X) \ni f \mapsto E(f) = \sum_{\sigma \in X} f(\sigma) E_\sigma \in \mathcal{B}(\mathcal{H}^{\otimes N}), \tag{7.54} \]

where $X \subset \mathcal{B}^*(\mathcal{H})$ is a finite set^ of pure states. The quality of $E$ can be measured



^The generalization of the following considerations to continuous sets and a measure theoretic setup is straightforward and does not lead to a different result; i.e. we can not improve the estimation quality with continuous observables.
Figure 7.3: One- and all-qubit fidelities of the optimal purifier for $\vartheta = 0.5$ and $N = 10$. Plotted as a function of $M$.



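in analogy to Subsection 7.1.1 by a fidelity-like quantity

\[ \mathcal{F}_s(E) = \inf_\psi \langle \psi, \rho_\psi \psi \rangle = \inf_\psi \sum_{\sigma \in X} \langle \psi^{\otimes N}, E_\sigma \psi^{\otimes N} \rangle \langle \psi, \sigma\psi \rangle \tag{7.55} \]

where $\rho_\psi = \sum_\sigma \langle \psi^{\otimes N}, E_\sigma \psi^{\otimes N} \rangle\, \sigma$ is the (density matrix valued) expectation value of $E$ and the infimum is taken over all pure states $\psi$. Hence $\mathcal{F}_s(E)$ measures the worst fidelity of $\rho_\psi$ with respect to the input state $\psi$. If we construct now a cloner $T_E$ from $E$ by

\[ T_E^*(\rho) = \sum_{\sigma \in X} \mathrm{tr}(E_\sigma \rho)\, \sigma^{\otimes M}, \tag{7.56} \]

its one-particle fidelity $\mathcal{F}_{c,1}(T_E)$ coincides obviously with $\mathcal{F}_s(E)$. Since we can produce in this way arbitrarily many clones of the same quality we see that $\mathcal{F}_s(E)$ is smaller than $\mathcal{F}_{c,1}(N, M)$ for all $M$ and therefore

\[ \mathcal{F}_s(E) \leq \mathcal{F}_{c,1}(N, \infty) = \lim_{M \to \infty} \mathcal{F}_{c,1}(N, M) = \frac{d-1}{d}\,\frac{N}{N+d} + \frac{1}{d}, \tag{7.57} \]

where we can look at $\mathcal{F}_{c,1}(N, \infty)$ as the optimal quality of a cloner which produces arbitrarily many outputs from $N$ input systems.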

To see that this bound can be saturated consider an asymptotically exact family

\[ \mathcal{C}(X_M) \ni f \mapsto E^M(f) = \sum_{\sigma \in X_M} f(\sigma) E^M_\sigma \in \mathcal{B}(\mathcal{H}^{\otimes M}), \quad X_M \subset \mathcal{S}(\mathcal{H}) \tag{7.58} \]

of estimators, i.e. the error probabilities (4.17) vanish in the limit $M \to \infty$. If the $E^M_\sigma \in \mathcal{B}(\mathcal{H}^{\otimes M})$ are pure tensor products (i.e. the $E^M$ are realized by a "quorum" of observables as described in Subsection 4.2.1) they can not distinguish between the output state $\hat{T}^*(\rho^{\otimes N})$ (which is highly correlated) and the pure product state $\tilde{\rho}^{\otimes M}$, where $\tilde{\rho} \in \mathcal{B}^*(\mathcal{H})$ denotes the partial trace over $M - 1$ tensor factors (due to permutation invariance it does not matter which factors we trace away here). Hence if we apply $E^M$ to the output of the optimal $N$ to $M$ cloner $\hat{T}_{N \to M}$ we get an estimate for $\tilde{\rho}$ and in the limit $M \to \infty$ this estimate is exact. The fidelity $\langle \psi, \tilde{\rho}\psi \rangle$ of $\tilde{\rho}$ with respect to the pure input state $\psi$ of $\hat{T}_{N \to M}$ coincides however with $\mathcal{F}_{c,1}(N, M)$. Hence the composition of $\hat{T}_{N \to M}$ with $E^M$ converges^ to an estimator $E$ with $\mathcal{F}_s(E) = \mathcal{F}_{c,1}(N, \infty)$. We can rephrase this result roughly in the form: "producing infinitely many optimal clones of a pure state $\psi$ is the same as estimating $\psi$ optimally".
7.2.4. The UNOT gate. — The discussion of the last subsection shows that the optimal cloner $\hat{T}_{N \to M}$ produces better clones than any estimation based scheme (as in Equation (7.56)), as long as we are interested only in finitely many copies. Loosely speaking we can say that the detour via classical information is wasteful and destroys too much quantum information. The same is true for the optimal purifier: We can first run an estimator on the mixed input state $\rho(\beta)^{\otimes N}$, apply the inverse $(R^*)^{-1}$ of the channel map to the attained classical data and reprepare arbitrarily many purified qubits accordingly. The quality of output systems attained this way is, however, worse than that of the optimal purifier from Equation (7.49) as long as the number $M$ of output systems is finite; this can be seen easily from Figure 7.3. In this sense the UNOT gate is a harder task than cloning and purification, because there is no quantum operation which performs better than the estimation based strategy. The following theorem can be proved again with the group theoretical scheme from Subsection 7.1.3 [B6

Theorem 7.2.3 Let $\mathcal{H} = \mathbb{C}^2$. Among all channels $T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ the estimation based scheme just described attains the biggest possible value for the fidelity $\mathcal{F}_{\Theta,\natural}$, namely

\[ \mathcal{F}_{\Theta,1}(N, 1) = \mathcal{F}_{\Theta,\mathrm{all}}(N, 1) = 1 - \frac{1}{N+2}. \tag{7.59} \]

The dependence on the number $M$ of outputs is not interesting here, because the optimal device produces arbitrarily many copies of the same quality.

7.3 Asymptotic behaviour 

If a device, such as the optimal cloner, is given which produces $M$ output systems from $N$ inputs, it is interesting to ask for the maximal rate, i.e. the maximal ratio $M(N)/N$ in the limit $N \to \infty$ such that the asymptotic fidelity $\lim_{N \to \infty} \mathcal{F}(N, M(N))$ is above a certain threshold (preferably equal to one). Note that this type of question was very important as well for distillation of entanglement and channel capacities, but hardly computable there. In the current context this type of question is somewhat easier to answer. This relies on the one hand on the group theoretical structure presented in the last section and on the other on the close relation to quantum state estimation. We start this section therefore with a look at some aspects of the asymptotics of mixed state estimation.

7.3.1. Estimating mixed state. — If we do not know a priori that the input 
systems are in a pure state, much less is known about estimating and cloning. In
particular, it is almost impossible to say anything about optimality for finitely many
input systems (only if N is very small, e.g. |l56| ). Nevertheless some strong results
are available for the behavior in the limit $N \to \infty$ and we will give here a short
review of some of them.

One quantity of interest for a family of estimators $E^N$ in the
limit $N \to \infty$ is the variance of the $E^N$. To state some results in this context it is
convenient to parameterize the state space $\mathcal{S}(\mathcal{H})$ or parts of it in terms of n real

^Basically convergence must be shown here. It follows however easily from the corresponding 
property of the E*^. 
parameters $x = (x_1, \ldots, x_n) \in \Sigma \subset \mathbb{R}^n$ and to write $\rho(x)$ as the corresponding
state. If we want to cover all states, one particular parameterization is e.g. the
generalized Bloch ball from Subsection 2.1.2. An estimator taking N input systems
is now a (discrete) observable $E^N_x \in \mathcal{B}(\mathcal{H}^{\otimes N})$, $x \in X_N$ with values in a (finite)
subset $X_N$ of $\Sigma$. The expectation value of $E^N$ in the state $\rho(x)^{\otimes N}$ is therefore the
vector $\langle E^N \rangle_x$ with components $\langle E^N \rangle_{x,j}$, $j = 1, \ldots, n$ given by

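$$ \langle E^N \rangle_{x,j} = \sum_{y \in X_N} y_j \, \mathrm{tr}\bigl(E^N_y \rho(x)^{\otimes N}\bigr) \tag{7.60} $$

and the mean quadratic error is described by the matrix

$$ V^N_{jk}(x) = \sum_{y \in X_N} \bigl(\langle E^N \rangle_{x,j} - y_j\bigr)\bigl(\langle E^N \rangle_{x,k} - y_k\bigr) \, \mathrm{tr}\bigl(E^N_y \rho(x)^{\otimes N}\bigr). \tag{7.61} $$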
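The sums in (7.60) and (7.61) can be evaluated exactly in simple cases. The following Python sketch does this for a hypothetical one-parameter example that is not taken from the text: a qubit with Bloch component x along the 3-axis, where $\sigma_3$ is measured on each of the N copies and the estimate is the average of the outcomes. The function name and the chosen value x = 0.6 are illustrative assumptions.

```python
from math import comb


def estimator_moments(n_copies, x):
    """Mean and mean quadratic error, cf. (7.60) and (7.61), for a
    hypothetical scheme: measure sigma_3 on each copy and report the
    average y = (#up - #down) / N as the estimate of x."""
    p_up = (1.0 + x) / 2.0  # probability of the outcome +1 on one copy
    outcomes = []
    for k in range(n_copies + 1):  # k = number of +1 outcomes
        y = (2 * k - n_copies) / n_copies
        prob = comb(n_copies, k) * p_up**k * (1.0 - p_up) ** (n_copies - k)
        outcomes.append((y, prob))
    mean = sum(y * prob for y, prob in outcomes)
    error = sum((mean - y) ** 2 * prob for y, prob in outcomes)
    return mean, error


for n in (10, 100, 400):
    mean, error = estimator_moments(n, 0.6)
    # For this scheme N times the error stays at 1 - x**2 for every N.
    print(n, mean, n * error)
```

In this example the product of N and the mean quadratic error does not depend on N, which is the kind of scaling behavior one hopes for in a good estimation strategy.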

For a good estimation strategy we expect that $V^N_{jk}(x)$ decreases as $1/N$, i.e.

$$ V^N_{jk}(x) \simeq \frac{W_{jk}(x)}{N}, \tag{7.62} $$

where the scaled mean quadratic error matrix $W_{jk}(x)$ does not depend on N. The
task is now to find bounds on this matrix. We will state here two results taken from
|l66| . To this end we need the Helstrom quantum information matrix

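$$ H_{jk}(x) = \mathrm{tr}\left[ \rho(x) \, \frac{\lambda_j(x)\lambda_k(x) + \lambda_k(x)\lambda_j(x)}{2} \right] \tag{7.63} $$

which is defined in terms of symmetric logarithmic derivatives $\lambda_j$, which in turn are
implicitly given by

$$ \frac{\partial \rho(x)}{\partial x_j} = \frac{\lambda_j(x)\rho(x) + \rho(x)\lambda_j(x)}{2}. \tag{7.64} $$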

Now we have the following theorem |Q : 

Theorem 7.3.1 Consider a family of estimators $E^N$, $N \in \mathbb{N}$ as described above
such that the following conditions hold:

1. The scaled mean quadratic error matrix $N V^N_{jk}(x)$ converges uniformly in x to
$W_{jk}(x)$ as $N \to \infty$.

2. $W_{jk}(x)$ is continuous at a point $x = x_0$.

3. $H_{jk}(x)$ and its derivatives are bounded in a neighborhood of $x_0$.
Then we have

$$ \mathrm{tr}\bigl[ H^{-1}(x_0) W^{-1}(x_0) \bigr] \le (d-1). \tag{7.65} $$

For qubits this bound can be attained by a particular estimation strategy which 
measures on each qubit separately. We refer to |Q for details. 
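As a numerical illustration of the quantities entering Theorem 7.3.1, the following Python sketch computes the symmetric logarithmic derivatives from (7.64) and the Helstrom matrix (7.63) for a qubit in the Bloch parameterization $\rho(x) = (\mathbb{1} + x \cdot \sigma)/2$. It then evaluates the left hand side of (7.65) for one simple scheme that measures each qubit separately ($\sigma_1$, $\sigma_2$, $\sigma_3$ each on a third of the copies). Identifying $W^{-1}$ with the classical Fisher information per copy assumes an asymptotically efficient classical estimator. This identification, the function names and the test point are assumptions of the sketch, and it is not claimed that this scheme is the strategy meant in the cited reference.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def bloch_state(x):
    """Qubit density matrix rho(x) = (1 + x . sigma) / 2, with |x| < 1."""
    return 0.5 * (np.eye(2) + np.einsum("j,jab->ab", x, SIGMA))


def sld(rho, drho):
    """Solve (lam rho + rho lam) / 2 = drho in the eigenbasis of rho."""
    p, u = np.linalg.eigh(rho)
    d = u.conj().T @ drho @ u
    lam = 2.0 * d / (p[:, None] + p[None, :])
    return u @ lam @ u.conj().T


def helstrom_matrix(x):
    """H_jk = tr[rho (lam_j lam_k + lam_k lam_j) / 2], cf. (7.63)."""
    rho = bloch_state(x)
    lams = [sld(rho, 0.5 * SIGMA[j]) for j in range(3)]  # d rho / d x_j = sigma_j / 2
    h = np.empty((3, 3))
    for j in range(3):
        for k in range(3):
            sym = 0.5 * (lams[j] @ lams[k] + lams[k] @ lams[j])
            h[j, k] = np.trace(rho @ sym).real
    return h


def separate_measurement_fisher(x):
    """Per-copy classical Fisher information if sigma_1, sigma_2, sigma_3
    are each measured on one third of the copies (hypothetical scheme)."""
    return np.diag(1.0 / (3.0 * (1.0 - x**2)))


x0 = np.array([0.3, -0.2, 0.5])
h = helstrom_matrix(x0)
w_inv = separate_measurement_fisher(x0)
print(np.trace(np.linalg.inv(h) @ w_inv))  # to be compared with d - 1 = 1
```

For a qubit the numerically computed matrix can be compared with the closed form $H = \mathbb{1} + x x^T / (1 - |x|^2)$, which is a convenient consistency check on the solver for (7.64).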

A second quantity interesting to study in the limit $N \to \infty$ is the error
probability defined in Section 4.2; cf. Equation (4.17). For a good estimation strategy
it should of course go to zero; an additional question, however, concerns the rate
at which this happens. We will review here a result which concerns the
subproblem of estimating the spectrum. Hence we are now looking at a family of
observables $E^N : \mathcal{C}(X_N) \to \mathcal{B}(\mathcal{H}^{\otimes N})$, $N \in \mathbb{N}$ taking their values in a finite subset
$X_N$ of the set

$$ \Sigma = \Bigl\{ (x_1, \ldots, x_d) \in \mathbb{R}^d \,\Big|\, x_1 \ge \cdots \ge x_d \ge 0, \ \sum_j x_j = 1 \Bigr\} \tag{7.66} $$
of ordered spectra of density operators on $\mathcal{H} = \mathbb{C}^d$. Our aim is to determine the
behavior of the error probabilities (cf. Equation (4.17))

$$ K_N(\Delta) = \mathrm{tr}\bigl( E^N(\Delta) \, \rho^{\otimes N} \bigr) \tag{7.67} $$



in the limit $N \to \infty$. Following the general arguments in Subsection 7.1.2 we can
restrict our attention here to covariant observables, i.e. we can assume without loss
of estimation quality that the $E^N_x$ commute with all permutation unitaries $V_p$, $p \in S_N$
and all local unitaries $U^{\otimes N}$, $U \in \mathrm{U}(d)$. If we restrict our attention in addition to
projection valued measures, which is suggestive for ruling out unnecessary fuzziness,
we see that each $E^N_x$ must coincide with a (sum of) projections $P_Y$ from $\mathcal{H}^{\otimes N}$ onto
the $\mathrm{U}(d)$ respectively $V_p$ invariant subspace $\mathcal{H}_Y \otimes \mathcal{K}_Y$, which is defined in Equation
(7.32), where $Y = (Y_1, \ldots, Y_d)$ refers here to Young frames with d rows and N boxes.



The only remaining freedom for the $E^N$ is the assignment $x(Y) \in \Sigma$ of Young frames
(and therefore projections $E^N_{x(Y)}$) to points in $\Sigma$. Since the Young frames themselves
have up to normalization the same structure as the elements of $\Sigma$, one possibility
for $x(Y)$ is just $x(Y) = Y/N$. Written as quantum to classical channel this is

$$ \mathcal{C}(X_N) \ni f \mapsto \sum_Y f(Y/N) \, P_Y \in \mathcal{B}(\mathcal{H}^{\otimes N}), \tag{7.68} $$



where $X_N \subset \Sigma$ is the set of normalized Young frames, i.e. all $Y/N$ if Y has d
rows and N boxes. It turns out, somewhat surprisingly, that this choice leads indeed
to an asymptotically exact estimation strategy with exponentially decaying error
probability (7.67). The following theorem can be proven with methods from the
theory of large deviations:



Theorem 7.3.2 The family of estimators $E^N$, $N \in \mathbb{N}$ given in Equation (7.68) is
asymptotically exact, i.e. the error probabilities $K_N(\Delta)$ vanish in the limit $N \to \infty$
if $\Delta$ is a complement of a ball around the spectrum $r \in \Sigma$ of $\rho$. If $\Delta$ is a set (possibly
containing r) whose interior is dense in its closure we have the asymptotic estimate
for $K_N(\Delta)$:

$$ \lim_{N \to \infty} \frac{1}{N} \ln K_N(\Delta) = - \inf_{s \in \Delta} I(s), \tag{7.69} $$

where the "rate function" $I : \Sigma \to \mathbb{R}$ is just the relative entropy between the two
probability vectors s and r

$$ I(s) = \sum_j s_j \bigl( \ln s_j - \ln r_j \bigr). \tag{7.70} $$
To make this statement more transparent, note that we can rephrase (7.69) as

$$ K_N(\Delta) \approx \exp\Bigl( -N \inf_{s \in \Delta} I(s) \Bigr). \tag{7.71} $$

Since the rate function I vanishes only for s = r we see that the probability measures
$K_N$ converge (weakly) to a point measure concentrated at $r \in \Sigma$. The rate of this
convergence is exponential and measured exactly by the function I.
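The concentration of the measures $K_N$ can be checked numerically for qubits (d = 2). The following Python sketch uses two standard facts that are not spelled out in the text: the multiplicity space of the Young frame $(Y_1, Y_2)$ has dimension $\binom{N}{Y_2} - \binom{N}{Y_2 - 1}$, and the trace of $\rho^{\otimes N}$ over the corresponding $\mathrm{U}(2)$ representation is $\sum_{k=Y_2}^{Y_1} r_1^k r_2^{N-k}$. The spectrum r = (0.8, 0.2), the radius 0.1 and all function names are illustrative assumptions.

```python
from math import comb, log


def young_frame_distribution(n, r1):
    """Probabilities tr(P_Y rho^{(x)N}) for a qubit state with ordered
    spectrum (r1, 1 - r1), indexed by the first row length Y1."""
    r2 = 1.0 - r1
    dist = {}
    for y2 in range(n // 2 + 1):
        y1 = n - y2
        # dimension of the permutation group representation (multiplicity)
        multiplicity = comb(n, y2) - (comb(n, y2 - 1) if y2 > 0 else 0)
        # character of the U(2) representation evaluated on the spectrum
        character = sum(r1**k * r2 ** (n - k) for k in range(y2, y1 + 1))
        dist[y1] = multiplicity * character
    return dist


def error_probability(n, r1, radius):
    """K_N(Delta) for Delta = complement of the ball |x1 - r1| < radius,
    using the estimate x(Y) = Y / N from (7.68)."""
    dist = young_frame_distribution(n, r1)
    return sum(p for y1, p in dist.items() if abs(y1 / n - r1) >= radius)


def relative_entropy(s1, r1):
    """Rate function I(s) for d = 2, with the convention 0 ln 0 = 0."""
    total = 0.0
    for s, r in ((s1, r1), (1.0 - s1, 1.0 - r1)):
        if s > 0.0:
            total += s * (log(s) - log(r))
    return total


r1, radius = 0.8, 0.1
# I is convex with minimum at r1, so the infimum over Delta sits on the boundary.
rate = min(relative_entropy(r1 - radius, r1), relative_entropy(r1 + radius, r1))
for n in (50, 200, 800):
    k_n = error_probability(n, r1, radius)
    print(n, k_n, -log(k_n) / n, rate)
```

The printed values of $-\frac{1}{N} \ln K_N(\Delta)$ can be compared with the infimum of the rate function. Since (7.69) is a statement about the limit, agreement at finite N is only approximate.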

7.3.2. Purification and cloning. — Let us come back now to the discussion of 
purification started in Subsection 7.2.2 (consequently we have $\mathcal{H} = \mathbb{C}^2$ again). Our
aim is now to calculate the fidelities $\mathcal{F}_{R,\#}(N, M(N))$ in the limit $N \to \infty$ for a
sequence $M(N)$, $N \in \mathbb{N}$ such that $M(N)/N$ converges to a value $c \in \mathbb{R}$. The crucial
step to do this is the application of Theorem 7.3.2. The density matrices $\rho_s(\beta)$ from
Equation (7.46) can be defined alternatively by

$$ \rho_s(\beta) \otimes \frac{\mathbb{1}}{\dim \mathcal{K}_{N,s}} = w_N(s)^{-1} P_s \, \rho(\beta)^{\otimes N} P_s, \qquad w_N(s) = \mathrm{tr}\bigl( \rho(\beta)^{\otimes N} P_s \bigr) \tag{7.72} $$

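where $P_s$ is the projection from $\mathcal{H}^{\otimes N}$ to $\mathcal{H}_s \otimes \mathcal{K}_{N,s}$. In other words $P_s$ is equal to
$P_Y$ from Equation (7.68) if we apply the reparametrization

$$ (Y_1, Y_2) \mapsto (s, N) = \bigl( (Y_1 - Y_2)/2, \ Y_1 + Y_2 \bigr). \tag{7.73} $$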

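In a similar way we can rewrite the set of ordered spectra by $\Sigma \ni (x_1, x_2) \mapsto
x_1 - x_2 \in [0, 1]$ and $K_N(\Delta)$ becomes a measure on $[0, 1]$ (i.e. $\Delta \subset [0, 1]$):

$$ K_N(\Delta) = \sum_{2s/N \in \Delta} \mathrm{tr}\bigl( \rho(\beta)^{\otimes N} P_s \bigr) = \sum_{2s/N \in \Delta} w_N(s) \tag{7.74} $$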

and the sum

$$ \mathcal{F}_{R,\#}(N, M(N)) = \sum_s w_N(s) \, f_\#(M(N), \beta, s) \tag{7.75} $$

can be rephrased as the integral of a function $[0,1] \ni x \mapsto \tilde f_\#(N, \beta, x) \in \mathbb{R}$
with respect to this measure, provided $\tilde f_\#$ is related to $f_\#$ by
$\tilde f_\#(N, \beta, 2s/N) = f_\#(M(N), \beta, s)$. According to Theorem 7.3.2 the $K_N$ converge
to a point measure concentrated at the ordered spectrum of $\rho(\beta)$; but the latter
corresponds, according to the reparametrization above, to the noise parameter
$\vartheta = \tanh \beta$. Hence if the sequence of functions $\tilde f_\#(N, \beta, \cdot\,)$ converges for
$N \to \infty$ uniformly (or at least uniformly on a neighborhood of $\vartheta$) to
$\tilde f_\#(\beta, \cdot\,)$ we get

$$ \lim_{N \to \infty} \mathcal{F}_{R,\#}(N, M(N)) = \lim_{N \to \infty} \sum_s w_N(s) \, \tilde f_\#(N, \beta, 2s/N) = \tilde f_\#(\beta, \vartheta) \tag{7.76} $$

for the limit of the fidelities. A precise formulation of this idea leads to the following
theorem [ |lOO|



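Theorem 7.3.3 The two purification fidelities $\mathcal{F}_{R,\#}$ have the following limits:

$$ \lim_{N \to \infty} \lim_{M \to \infty} \mathcal{F}_{R,1}(N, M) = 1 \tag{7.77} $$

and

$$ \Phi(\mu) = \lim_{\substack{N \to \infty \\ M/N \to \mu}} \mathcal{F}_{R,\mathrm{all}}(N, M) = \begin{cases} \dfrac{2\vartheta^2}{2\vartheta^2 + \mu(1 - \vartheta)} & \text{if } \mu \le \vartheta \\[2ex] \dfrac{2\vartheta^2}{\mu(1 + \vartheta)} & \text{if } \mu \ge \vartheta. \end{cases} \tag{7.78} $$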

If we are only interested in the quality of each qubit separately we can produce
arbitrarily good purified qubits at any rate. If on the other hand the correlations
between the output systems should vanish in the limit the rate is always zero. This
can be seen from the function $\Phi$, which is the asymptotic all-qubit fidelity which
can be reached by a given rate $\mu$. We have plotted it in Figure 7.4. Note finally that
the results just stated contain the rates of optimal cloning machines as a special
case; we only have to set $\vartheta = 1$.
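To get a feeling for the rate dependence, the asymptotic all-qubit fidelity can be tabulated directly. The Python sketch below assumes the piecewise form $\Phi(\mu) = 2\vartheta^2 / (2\vartheta^2 + \mu(1 - \vartheta))$ for $\mu \le \vartheta$ and $\Phi(\mu) = 2\vartheta^2 / (\mu(1 + \vartheta))$ for $\mu \ge \vartheta$. This is our reading of Equation (7.78), which is badly damaged in the available copy, and it should be checked against the original reference before being relied upon. The function name and the sample values are illustrative.

```python
def asymptotic_all_qubit_fidelity(rate, theta):
    """Phi(mu) for noise parameter theta = tanh(beta), under the
    assumed piecewise form of Equation (7.78)."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    if rate < 0.0:
        raise ValueError("rate must be non-negative")
    if rate <= theta:
        return 2.0 * theta**2 / (2.0 * theta**2 + rate * (1.0 - theta))
    return 2.0 * theta**2 / (rate * (1.0 + theta))


theta = 0.5
for rate in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(rate, asymptotic_all_qubit_fidelity(rate, theta))

# Cloning corresponds to theta = 1 (pure inputs).
print(asymptotic_all_qubit_fidelity(2.0, 1.0))
```

Under this assumed form the two branches agree at $\mu = \vartheta$, the fidelity equals one only at rate zero when $\vartheta < 1$, and for $\vartheta = 1$ the second branch reduces to $1/\mu$.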
Figure 7.4: Asymptotic all-qubit fidelity $\Phi(\mu)$ plotted as function of the rate $\mu$.